Julian Seward wrote:
> Yes, I guess so.  Although I'm not sure how this could be implemented.
> The trick with the pipe in sema.c implements a lock well enough, but how
> does one implement a lock in which (1) the unlocker can control which
> thread gets the lock next, and (2) that signalling is somehow implied to
> the kernel, so as to force it to "decide" on our behalf?  I'm not sure.
> 
> Also, deciding ourselves which thread is next to run is dangerous.
> If we decide to run thread X (and cause all the rest to be blocked), but
> for some reason that we don't anticipate, the kernel believes X to be 
> blocked, then the system will deadlock.
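
(For context, the pipe trick in sema.c works roughly like the sketch 
below -- this is not the real sema.c code, just the idea: one token byte 
circulates through a pipe, acquiring the lock means reading the byte, 
releasing it means writing it back, and the kernel's choice of which 
blocked read() to satisfy is what picks the next lock holder -- which is 
exactly why the unlocker has no say in the matter.)

   /* Minimal sketch of a pipe-based lock; hypothetical names. */
   #include <unistd.h>

   static int sema_fd[2];              /* [0] read end, [1] write end */

   static void sema_init(void)
   {
      pipe(sema_fd);
      char token = 'S';
      write(sema_fd[1], &token, 1);    /* one byte in flight == lock free */
   }

   static void sema_down(void)         /* acquire: block until the byte arrives */
   {
      char token;
      while (read(sema_fd[0], &token, 1) != 1)
         ;                             /* retry on EINTR */
   }

   static void sema_up(void)           /* release: put the byte back */
   {
      char token = 'S';
      write(sema_fd[1], &token, 1);
   }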

Are you in "user code" or in "kernel code" ?  The point I am making is 
that all system calls must be allowed to run immediately and allowed to 
block inside the kernel; this means the act of invoking a kernel call 
must always release/ensure at least one other thread that should be 
running in "user code" is unblocked to run (just in case this thread 
blocks inside the kernel).

Then, on return from the system call, the current thread checks whether 
it is still allowed to run; if it is not, it suspends itself.

So at all times where thread runnability is being manipulated 
(specifically, where a thread is being artificially suspended) we are 
never in "kernel code" and always in "user code".

Following the above rules there is no chance of deadlock that I can 
see.  The deadlock I am referring to is the one where a system call 
blocks while it was the only runnable thread in the process, and that 
situation is due to valgrind's artificial manipulation of runnability, 
whereas the application's view of the runtime behaviour at that moment 
is that at least one other thread is runnable (and should be running).


There is also the possibility of valgrind creating a thread-management 
thread that always runs, but it would ideally be hidden from the 
application.  This may be hard to do.



> You may remember, a long time ago (in 2.4.0 days) there was at one point
> a patch (committed, and later removed) which changed the exit order of 
> threads so that (iirc) the root thread for the process was
> not allowed to exit until all other threads had exited.  This was in order
> that shell scripts, etc, waiting for the root thread to finish, would be
> guaranteed that the final error messages, stats, etc, were printed before
> the root thread exited.  But this effectively added inter-thread dependencies
> which were not present in the original program, and caused complex threaded
> apps to sometimes deadlock at exit, most notably OpenOffice.

Are you sure this patch wasn't only needed due to the way Linux 
2.2/2.4/early-2.6 worked with LinuxThreads?  I.e. each thread had a 
unique PID, which meant that once the main thread died things could go 
wrong for waitpid(), which shell scripts use in the way you describe.

But since NPTL (later 2.6) all threads in the process share the PID, 
and only when ALL threads exit does the process exit.  I think this 
also means that the main thread can exit early; but in practice most 
applications like the concept of a "main thread" to hang everything 
else off.
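
A quick way to see the NPTL behaviour (every thread reports the same 
getpid(), while the per-task id from gettid() differs; under old 
LinuxThreads each thread showed up with its own PID):

   #include <stdio.h>
   #include <pthread.h>
   #include <unistd.h>
   #include <sys/syscall.h>

   static void* worker(void* arg)
   {
      (void)arg;
      printf("worker: pid=%d tid=%ld\n", getpid(), (long)syscall(SYS_gettid));
      return NULL;
   }

   int main(void)
   {
      printf("main:   pid=%d tid=%ld\n", getpid(), (long)syscall(SYS_gettid));
      pthread_t t;
      pthread_create(&t, NULL, worker, NULL);
      pthread_join(t, NULL);
      return 0;   /* the process exits only when all threads have exited */
   }

Build with -pthread; on an NPTL system both lines print the same pid.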


> As a result of that I'm leery about screwing with thread scheduling at
> all.  IMO we should leave that stuff entirely to the kernel.
> 
> I think the original poster (Felix) would do well to explain what
> problem he is trying to solve by scheduling threads himself.  Perhaps
> then we could think of some other solution.

Yes, maybe there is some mileage in this too; maybe the kernel already 
has an API to do the sort of thread manipulation we are looking for and 
we just don't know about it.

That is, the ability to control/suspend thread execution/scheduling 
within a process (possibly its own, possibly another process's), but 
done in a way that is not observable to the application.  This might 
have uses outside debugging, but it is mainly targeted at debugging.

One basic idea along these lines might be for the kernel to send a 
signal to the process whenever the kernel needs to block a thread (i.e. 
put it to sleep).  But that signal delivery would have to be 
synchronous (i.e. it must occur and complete before that thread is 
allowed to run again), which I think is pretty much an anti-pattern for 
the way signals work; the kernel would have to create a new task 
runnable state, "blocked-pending-synchronous-signal-delivery".  Also, a 
synchronous signal queue might need to take priority over the async 
signal queue.

Then you've also got the desire for some basic means of hinting at, or 
forcing, thread runnability.

In short, the above is basically the fictitious API I wrote out (in my 
previous email to this thread) but implemented in the context of kernel 
versus userspace (as opposed to application versus valgrind), where the 
callback function I defined, my_scheduler_implementation(), becomes a 
signal handler and all the vg_thread_control() stuff becomes system 
calls.
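
Something along these (entirely made-up) lines -- SIGTASKBLOCK, 
SYS_thread_control and pick_next_thread() do not exist, they are just 
the fictitious API restated in kernel-versus-userspace terms:

   #include <signal.h>
   #include <string.h>
   #include <unistd.h>
   #include <sys/syscall.h>
   #include <sys/types.h>

   #define SIGTASKBLOCK       (SIGRTMIN + 0)  /* hypothetical: "a task is
                                                 about to block in the kernel" */
   #define SYS_thread_control 500             /* hypothetical syscall number */
   enum { THREAD_SUSPEND, THREAD_RESUME };

   extern pid_t pick_next_thread(pid_t blocking_tid);  /* tool policy, not shown */

   /* my_scheduler_implementation() becomes a signal handler, delivered
      synchronously before the blocking task is put to sleep. */
   static void my_scheduler_implementation(int sig, siginfo_t* si, void* uc)
   {
      (void)sig; (void)uc;
      pid_t blocking_tid = si->si_value.sival_int;  /* assume the kernel tells
                                                       us which task is blocking */

      /* The vg_thread_control() stuff becomes system calls:
         pick another task and ask the kernel to let it run. */
      pid_t next = pick_next_thread(blocking_tid);
      syscall(SYS_thread_control, next, THREAD_RESUME);
   }

   static void install_scheduler(void)
   {
      struct sigaction sa;
      memset(&sa, 0, sizeof(sa));
      sa.sa_sigaction = my_scheduler_implementation;
      sa.sa_flags     = SA_SIGINFO;
      sigemptyset(&sa.sa_mask);
      sigaction(SIGTASKBLOCK, &sa, NULL);
   }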


Darryl
