Re: forkProcess#

Wolfgang Thaller Wed, 24 Sep 2003 08:29:43 -0700

OK, now about the threaded RTS. Here, it is probably possible to implement it 100%, but it is very complex and probably requires doing some more bookkeeping for every foreign call. I feel that it's just not worth the effort.

In POSIX threads, fork() creates a new process containing a copy of the
current OS thread (only).  So, AFAICT, this means that

  (a) the RTS would need to adjust its idea of what worker threads
      are available.

  (b) if the Haskell thread that forked then exits:
      (1) if it was not a main thread, the RTS should exit
      (2) if it was a main thread, then
           (i)  if it was/is bound, we are ok, just return
           (ii) if it was/is not bound, we have no OS thread to return
                to, so the RTS must stop.
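The premise above (fork() copies only the calling OS thread) can be observed with a small C sketch. This is only an illustration, not RTS code: the names ticker, heartbeat and fork_copies_only_calling_thread are made up for the example, and the check is timing-based (it assumes a helper thread that ticks every millisecond will tick at least once in 100 ms).

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical names, for illustration only. */
static atomic_int heartbeat = 0;    /* bumped by the helper thread */
static atomic_int stop_ticker = 0;  /* tells the helper thread to finish */

static void sleep_ms(long ms) {
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Helper thread: increments the heartbeat roughly once per millisecond. */
static void *ticker(void *arg) {
    (void)arg;
    while (!atomic_load(&stop_ticker)) {
        atomic_fetch_add(&heartbeat, 1);
        sleep_ms(1);
    }
    return NULL;
}

/* Returns 1 if the heartbeat changed while we slept for 100 ms. */
static int heartbeat_advances(void) {
    int before = atomic_load(&heartbeat);
    sleep_ms(100);
    return atomic_load(&heartbeat) != before;
}

/* Returns 1 if the helper thread keeps running in the parent but is
 * absent in the forked child, 0 otherwise, and -1 on error. */
int fork_copies_only_calling_thread(void) {
    pthread_t t;
    atomic_store(&stop_ticker, 0);
    if (pthread_create(&t, NULL, ticker, NULL) != 0) return -1;
    while (atomic_load(&heartbeat) == 0) sleep_ms(1); /* wait for first tick */

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: only the forking thread exists here, so nobody ticks. */
        _exit(heartbeat_advances() ? 1 : 0);
    }

    int parent_alive = (pid > 0) ? heartbeat_advances() : 0;
    int status = 0;
    if (pid > 0) waitpid(pid, &status, 0);

    atomic_store(&stop_ticker, 1);
    pthread_join(t, NULL);

    if (pid < 0) return -1;
    int child_frozen = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    return parent_alive && child_frozen;
}
```

Calling fork_copies_only_calling_thread() from a main() should return 1 on a POSIX system: the child inherits the memory holding the counter, but not the thread that was updating it.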

Keep in mind that main thread == bound thread in the new threaded RTS; every call-in into Haskell, including the execution of main, is bound. Every bound thread has its corresponding StgMainThread.

(b) When we fork, we need to distinguish several situations:
The Haskell thread that forks...

  (1) ... is not bound, that is, a thread created using forkIO.
      We will want to "convert" it into a bound thread from now on,
      as the OS thread that remains in the forked process has the
      special property that the process exits when that OS thread
      exits.

  (2) ... is bound and has been created (directly) using forkOS.
      When the Haskell thread terminates, the OS thread will just
      terminate, and with it the whole forked process. This is good,
      but hs_exit() is not called, which is bad (streams are not
      flushed, etc.).

  (3) ... is bound and has been created by a call-in from a C
      routine ...
      (i)  ... that has in turn been foreign-called from Haskell.
           Sooner or later, the Haskell thread will return to C.
           Later, the C function will return to Haskell. The TSO
           associated with that Haskell thread (BlockedOnCCall) will
           still have to be around.
      (ii) ... that runs in its own OS thread which hasn't seen any
           Haskell code before (e.g. created by a library using
           pthread_create).

ad (1): I'm not sure how to do that converting right now (definitely possible, though). A scheduler loop (= an invocation of schedule()) knows whether it's a worker thread loop or a bound thread loop because it gets passed a StgMainThread as its parameter in the latter case. It would be necessary to replace this parameter with a thread-local-state variable or (to avoid performance costs) to "retroactively" change this parameter using some evil trickery involving a thread-local-state variable with a pointer to the "current scheduler loop"'s parameter.

ad (2): This can probably be solved by using atexit() or by adding some extra code to the forkOS "wrapper".
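A rough C sketch of the atexit() idea follows. It relies on POSIX specifying that when the last thread of a process terminates, the process exits as if exit(0) had been called, so registered atexit() handlers still run. Nothing here is GHC code: cleanup_at_exit is a hypothetical stand-in for whatever would call hs_exit(), and the pipe exists only so that the parent can see whether the handler ran.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical: write end of a pipe used to report that cleanup ran. */
static int notify_fd = -1;

/* Hypothetical stand-in for the hs_exit()-style cleanup. */
static void cleanup_at_exit(void) {
    const char done = 'x';
    if (notify_fd >= 0) {
        ssize_t n = write(notify_fd, &done, 1);
        (void)n;
    }
}

/* Returns 1 if the atexit() handler ran in a forked child whose only
 * thread ended via pthread_exit(), 0 if it did not, -1 on error. */
int cleanup_runs_when_last_thread_exits(void) {
    int fds[2];
    if (pipe(fds) != 0) return -1;

    fflush(NULL); /* avoid duplicating buffered output in the child */
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        notify_fd = fds[1];
        if (atexit(cleanup_at_exit) != 0) _exit(2);
        /* The thread just terminates; it never calls exit() itself. */
        pthread_exit(NULL);
    }

    close(fds[1]);
    char buf = 0;
    ssize_t n = read(fds[0], &buf, 1); /* EOF (0) if cleanup never ran */
    close(fds[0]);

    int status = 0;
    waitpid(pid, &status, 0);
    return n == 1 && buf == 'x';
}
```

The alternative mentioned above, extra code in the forkOS "wrapper", would amount to calling the cleanup explicitly after the thread's action returns instead of registering it with atexit().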

ad (3)(i): We may kill all other StgMainThreads that are not BlockedOnCCall. But some of the blocked main threads might be threads that we will want to return to, so deleting them is fatal (if we ever return, rather than execing right away). Not deleting StgMainThreads that are BlockedOnCCall is a bad memory leak, but otherwise non-fatal. We could distinguish between the two by storing OSThreadIds in the StgMainThreads whenever we do a safe foreign call. From a performance point of view, getting the current OS thread ID is not the worst thing we could do, but we're talking about yet another kernel call for every foreign callout.

ad (3)(ii): We might have the same problem as in (2), but apart from that, everything is fine (as long as the foreign code plays along nicely).

I don't see any real problems here, but I'm probably missing something.

No "real" problems, but lots of special cases and lots of places where we could introduce bugs. Also, I have a strong feeling that almost everyone uses fork only in connection with exec, so it's really a lot of effort for something that hardly anyone ever uses.

Could you explain how your modified version of forkProcess# solves the
problems?

It circumvents them by being much simpler.

forkProcess' :: IO () -> IO ProcessID

The main difference is that the action to be executed in the child process is guaranteed not to return to some thread that we might have deleted. So in the child process, we just do the following:

  - delete _all_ Haskell threads
  - make note of the fact that we have no worker threads any more
  - "foreign-call" the action that we are supposed to execute
    (it is run in a new Haskell thread bound to the current OS thread)
  - clean up and exit()
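The shape of this "run an action in the child and never return" interface can be sketched at the C level. This is an analogy only, not the RTS implementation: fork_and_run, wait_for_exit_code and the two sample actions are hypothetical names, and the RTS-specific steps (deleting Haskell threads, resetting the worker-thread bookkeeping) appear only as a comment.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback type standing in for the Haskell IO () action. */
typedef void (*child_action)(void *arg);

/* Fork, run `action` in the child, and make sure the child never
 * returns into the caller's code. The parent gets the child's pid
 * (or -1 if fork() failed). */
pid_t fork_and_run(child_action action, void *arg) {
    fflush(NULL); /* avoid duplicating buffered output in the child */
    pid_t pid = fork();
    if (pid == 0) {
        /* In the RTS, this is where all Haskell threads would be
         * deleted and the worker-thread bookkeeping reset. */
        action(arg);
        exit(EXIT_SUCCESS); /* clean up and exit; never return */
    }
    return pid;
}

/* Wait for a child and return its exit code, or -1 on failure. */
int wait_for_exit_code(pid_t pid) {
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}

/* Sample action that simply returns; the child then exits with 0. */
void action_returns(void *arg) { (void)arg; }

/* Sample action that exits by itself with the code passed in. */
void action_exits_with(void *arg) { exit(*(int *)arg); }
```

Because the child always ends inside fork_and_run, nothing in the child can "return" to a thread or stack frame that no longer makes sense there, which is the simplification being argued for.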

It works for both the threaded RTS and the non-threaded RTS (for the non-threaded case, ignore the phrase "bound to the current OS thread").

I'm not aware of any case where the more limiting type signature of forkProcess' would be a problem, so I really like that simplicity.

Cheers,
Wolfgang

_______________________________________________
Cvs-ghc mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/cvs-ghc
