Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2013-01-21 Thread Alexander Kjeldaas
I think you can test this theory with this patch.  If a thread is waiting
on the task-cond condition variable which is matched up with task-lock,
then pthread_cond_destroy will return EBUSY, which must always be a bug in
the RTS.

Alexander

diff --git a/rts/posix/OSThreads.c b/rts/posix/OSThreads.c
index ae31966..0f12830 100644
--- a/rts/posix/OSThreads.c
+++ b/rts/posix/OSThreads.c
@@ -91,7 +91,8 @@ initCondition( Condition* pCond )
 void
 closeCondition( Condition* pCond )
 {
-  pthread_cond_destroy(pCond);
+  int ret = pthread_cond_destroy(pCond);
+  CHECKM(ret == 0, RTS BUG! Someone is waiting on condvar %d., ret);
   return;
 }



On Mon, Jan 21, 2013 at 8:18 AM, Alexander Kjeldaas 
alexander.kjeld...@gmail.com wrote:


 I just looked at this code and since I don't know the code I can't give
 you good solutions, but for others watching this thread the links might
 prove interesting.

 My main theory is that you do have some other thread in FFI-land while you
 are fork()ing.  The task-cond, task-lock seems to be related to this (see
 quoted comments below).

 Also, pthread_mutex_destroy is undefined if the lock is locked, so I am
 guessing that the task-lock is somehow locked when it shouldn't be.

 It isn't clear from your description whether this is consistently
 happening on Linux, or whether this only sometimes happens.

 The forkProcess() code seems to hold all capabilities during fork, but
 that does not include FFI-land threads AFAIU.

 Assuming that this happens only rarely, I am trying to understand what
 happens if the thread that is in FFI-land returns to the RTS (in the
 parent) after fork(), but before the freeTask() in the child.  Based on the
 descriptions I read, it seems likely that this thread will try to inspect
 task-cap, which requires holding task-lock.

 That would in turn make the pthread_mutex_destroy in the child invalid.

 https://github.com/ghc/ghc/blob/master/rts/Task.h#L57

 
  ...
  When a task is migrated from sleeping on one Capability to another,
its task-cap field must be modified.  When the task wakes up, it
will read the new value of task-cap to find out which Capability
it belongs to.  Hence some synchronisation is required on
task-cap, and this is why we have task-lock.

If the Task is not currently owned by task-id, then the thread is
either

  (a) waiting on the condition task-cond.  The Task is either
  (1) a bound Task, the TSO will be on a queue somewhere
  (2) a worker task, on the spare_workers queue of task-cap.
...
 

 freeTask:
 https://github.com/ghc/ghc/blob/master/rts/Task.c#L142

 the comment in freeTask refers to this test:

 https://github.com/ghc/testsuite/blob/master/tests/concurrent/should_run/conc059.hs

 That test calls the RTC from C which then forkIOs off actions that are
 outstanding when the RTS exits.

 in forkProcess, child code
 https://github.com/ghc/ghc/blob/master/rts/Schedule.c#L1837

 It look like all this code supports the notion that some other thread can
 be in foreign code during the fork call.

 discardTasksExcept
 https://github.com/ghc/ghc/blob/master/rts/Task.c#L305


 Alexander


 On Mon, Jan 21, 2013 at 12:15 AM, Mark Lentczner mark.lentcz...@gmail.com
  wrote:

 Sorry to be reviving this thread so long after but I seem to be
 running into similar issues as Michael S. did at the start.

 In short, I'm using forkProcess with the threaded RTS, and see occasional
 hangs:

- I see these only on Linux. On Mac OS X, I never do.
- I'm using GHC 7.4.2
- I noticed the warning in the doc for forkProcess, but assumed I was
safe, as I wasn't holding any shared resources at the time of the fork, 
 and
no shared resources in the program are used in the child.
- WIth gdb, I've traced the hang to here in the run-time: forkProcess
 discardTasksExcept  freeTask  closeMutex(task-lock)
 pthread_mutex_destroy

 The discussion in this thread leaves me with these questions:

- Is there reason to think the situation has gotten better in 7.6 and
later?
- Isn't the only reason *System.Process* is safer because it does an
immediate exec in the child? Alas, I really want to just fork()sometimes.
- Is it really true that even if my program has no shared resources
with the child, that the IO subsystem and FFI system do anyway? Surely the
RTS would take care of doing the right thing with those, no?
- There should be no concern with exec w.r.t. library invariants
since exec is wholesale replacement - all the libraries will
reinitialize. Is there a problem here I'm missing?

 Alas, I've stopped using the threaded RTS until I understand this better.

 - Mark



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2013-01-21 Thread Alexander Kjeldaas
Or this.  It seems that you must compile with DEBUG for the mutex check.
 This enables error-checking mutexes on posix.

Alexander

diff --git a/rts/posix/OSThreads.c b/rts/posix/OSThreads.c
index ae31966..e07221d 100644
--- a/rts/posix/OSThreads.c
+++ b/rts/posix/OSThreads.c
@@ -91,7 +91,8 @@ initCondition( Condition* pCond )
 void
 closeCondition( Condition* pCond )
 {
-  pthread_cond_destroy(pCond);
+  int ret = pthread_cond_destroy(pCond);
+  CHECKM(ret == 0, RTS Bug! Someone is waiting on condvar ret=%d., ret);
   return;
 }

@@ -165,7 +166,8 @@ initMutex(Mutex* pMut)
 void
 closeMutex(Mutex* pMut)
 {
-pthread_mutex_destroy(pMut);
+int ret = pthread_mutex_destroy(pMut);
+CHECKM(ret == 0, RTS Bug! Destroying held mutex ret=%d, ret);
 }

 void


On Mon, Jan 21, 2013 at 10:14 AM, Alexander Kjeldaas 
alexander.kjeld...@gmail.com wrote:

 I think you can test this theory with this patch.  If a thread is waiting
 on the task-cond condition variable which is matched up with task-lock,
 then pthread_cond_destroy will return EBUSY, which must always be a bug in
 the RTS.

 Alexander

 diff --git a/rts/posix/OSThreads.c b/rts/posix/OSThreads.c
 index ae31966..0f12830 100644
 --- a/rts/posix/OSThreads.c
 +++ b/rts/posix/OSThreads.c
 @@ -91,7 +91,8 @@ initCondition( Condition* pCond )
  void
  closeCondition( Condition* pCond )
  {
 -  pthread_cond_destroy(pCond);
 +  int ret = pthread_cond_destroy(pCond);
 +  CHECKM(ret == 0, RTS BUG! Someone is waiting on condvar %d., ret);
return;
  }



 On Mon, Jan 21, 2013 at 8:18 AM, Alexander Kjeldaas 
 alexander.kjeld...@gmail.com wrote:


 I just looked at this code and since I don't know the code I can't give
 you good solutions, but for others watching this thread the links might
 prove interesting.

 My main theory is that you do have some other thread in FFI-land while
 you are fork()ing.  The task-cond, task-lock seems to be related to this
 (see quoted comments below).

 Also, pthread_mutex_destroy is undefined if the lock is locked, so I am
 guessing that the task-lock is somehow locked when it shouldn't be.

 It isn't clear from your description whether this is consistently
 happening on Linux, or whether this only sometimes happens.

 The forkProcess() code seems to hold all capabilities during fork, but
 that does not include FFI-land threads AFAIU.

 Assuming that this happens only rarely, I am trying to understand what
 happens if the thread that is in FFI-land returns to the RTS (in the
 parent) after fork(), but before the freeTask() in the child.  Based on the
 descriptions I read, it seems likely that this thread will try to inspect
 task-cap, which requires holding task-lock.

 That would in turn make the pthread_mutex_destroy in the child invalid.

 https://github.com/ghc/ghc/blob/master/rts/Task.h#L57

 
  ...
  When a task is migrated from sleeping on one Capability to another,
its task-cap field must be modified.  When the task wakes up, it
will read the new value of task-cap to find out which Capability
it belongs to.  Hence some synchronisation is required on
task-cap, and this is why we have task-lock.

If the Task is not currently owned by task-id, then the thread is
either

  (a) waiting on the condition task-cond.  The Task is either
  (1) a bound Task, the TSO will be on a queue somewhere
  (2) a worker task, on the spare_workers queue of task-cap.
...
 

 freeTask:
 https://github.com/ghc/ghc/blob/master/rts/Task.c#L142

 the comment in freeTask refers to this test:

 https://github.com/ghc/testsuite/blob/master/tests/concurrent/should_run/conc059.hs

 That test calls the RTC from C which then forkIOs off actions that are
 outstanding when the RTS exits.

 in forkProcess, child code
 https://github.com/ghc/ghc/blob/master/rts/Schedule.c#L1837

 It look like all this code supports the notion that some other thread can
 be in foreign code during the fork call.

 discardTasksExcept
 https://github.com/ghc/ghc/blob/master/rts/Task.c#L305


 Alexander


 On Mon, Jan 21, 2013 at 12:15 AM, Mark Lentczner 
 mark.lentcz...@gmail.com wrote:

 Sorry to be reviving this thread so long after but I seem to be
 running into similar issues as Michael S. did at the start.

 In short, I'm using forkProcess with the threaded RTS, and see
 occasional hangs:

- I see these only on Linux. On Mac OS X, I never do.
- I'm using GHC 7.4.2
- I noticed the warning in the doc for forkProcess, but assumed I
was safe, as I wasn't holding any shared resources at the time of the 
 fork,
and no shared resources in the program are used in the child.
- WIth gdb, I've traced the hang to here in the run-time: forkProcess
 discardTasksExcept  freeTask  closeMutex(task-lock)
 pthread_mutex_destroy

 The discussion in this thread leaves me with these questions:

- Is there reason to think the situation has gotten better in 7.6
and later?
- Isn't the only reason *System.Process* 

Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2013-01-21 Thread Chris Dornan
I am also sorry to be late on this but I have run into the same problem
trying to demonise a programme on 7.4.2. My solution was to get a shell
wrapper to run the daemon in debug mode (I.e., sans fork) and get the shell
script to do the demonising.

Other than this I have found the threaded RTS to be sound and I am quite
reliant on it. So, where things that run ‹threaded are concerned, no
forkProcess calls for me until I can better understand this better.

If anybody does think they understand what is going on here it would be
great if they could write it up. IMHO, either the current notes on
forkProcess don't go far enough, or there is a bug needing a workaround
until the platform gets fixed.

Chris

From:  Mark Lentczner mark.lentcz...@gmail.com
Date:  Sunday, 20 January 2013 23:15
To:  haskell haskell-cafe@haskell.org
Cc:  Mike Meyer m...@mired.org
Subject:  Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

Sorry to be reviving this thread so long after but I seem to be running
into similar issues as Michael S. did at the start.

In short, I'm using forkProcess with the threaded RTS, and see occasional
hangs:
* I see these only on Linux. On Mac OS X, I never do.
* I'm using GHC 7.4.2
* I noticed the warning in the doc for forkProcess, but assumed I was safe,
as I wasn't holding any shared resources at the time of the fork, and no
shared resources in the program are used in the child.
* WIth gdb, I've traced the hang to here in the run-time: forkProcess 
discardTasksExcept  freeTask  closeMutex(task-lock) 
pthread_mutex_destroy
The discussion in this thread leaves me with these questions:
* Is there reason to think the situation has gotten better in 7.6 and later?
* Isn't the only reason System.Process is safer because it does an immediate
exec in the child? Alas, I really want to just fork() sometimes.
* Is it really true that even if my program has no shared resources with the
child, that the IO subsystem and FFI system do anyway? Surely the RTS would
take care of doing the right thing with those, no?
* There should be no concern with exec w.r.t. library invariants since exec
is wholesale replacement - all the libraries will reinitialize. Is there a
problem here I'm missing?
Alas, I've stopped using the threaded RTS until I understand this better.

- Mark
___ Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2013-01-21 Thread Alexander Kjeldaas
While trying to dig around this morning I started adding clang-style thread
locking annotations to the source code.  These can be very handy and I
found at least one place where the documented locking policy doesn't seem
to match what is happening.

Here is an example with annotations, and what might or might not be a bug.
 With these annotations, clang will be able to prove whether the program
obeys the locking regime or not.

But this is of course only one part of the RTS, but the locking can be
pretty interesting in itself.

Does anyone else feel that this sort of annotations would help?

Alexander

diff --git a/rts/Task.c b/rts/Task.c
index e6781a1..1e499dc 100644
--- a/rts/Task.c
+++ b/rts/Task.c
@@ -25,12 +25,12 @@

 // Task lists and global counters.
 // Locks required: all_tasks_mutex.
-Task *all_tasks = NULL;
+Task *all_tasks GUARDED_BY(all_tasks_mutex) = NULL;

-nat taskCount;
-nat workerCount;
-nat currentWorkerCount;
-nat peakWorkerCount;
+nat taskCount GUARDED_BY(all_tasks_mutex);
+nat workerCount GUARDED_BY(all_tasks_mutex);
+nat currentWorkerCount GUARDED_BY(all_tasks_mutex);
+nat peakWorkerCount GUARDED_BY(all_tasks_mutex);

 static int tasksInitialized = 0;

@@ -339,9 +339,11 @@ void updateCapabilityRefs (void)
 ACQUIRE_LOCK(all_tasks_mutex);

 for (task = all_tasks; task != NULL; task=task-all_next) {
+ACQUIRE_LOCK(task-lock);
 if (task-cap != NULL) {
 task-cap = capabilities[task-cap-no];
 }
+RELEASE_LOCK(task-lock);

 for (incall = task-incall; incall != NULL; incall =
incall-prev_stack) {
 if (incall-suspended_cap != NULL) {



On Mon, Jan 21, 2013 at 2:14 PM, Chris Dornan ch...@chrisdornan.com wrote:

 I am also sorry to be late on this but I have run into the same problem
 trying to demonise a programme on 7.4.2. My solution was to get a shell
 wrapper to run the daemon in debug mode (I.e., sans fork) and get the shell
 script to do the demonising.

 Other than this I have found the threaded RTS to be sound and I am quite
 reliant on it. So, where things that run —threaded are concerned, no
 forkProcess calls for me until I can better understand this better.

 If anybody does think they understand what is going on here it would be
 great if they could write it up. IMHO, either the current notes on
 forkProcess don't go far enough, or there is a bug needing a workaround
 until the platform gets fixed.

 Chris

 From: Mark Lentczner mark.lentcz...@gmail.com
 Date: Sunday, 20 January 2013 23:15
 To: haskell haskell-cafe@haskell.org
 Cc: Mike Meyer m...@mired.org
 Subject: Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

 Sorry to be reviving this thread so long after but I seem to be
 running into similar issues as Michael S. did at the start.

 In short, I'm using forkProcess with the threaded RTS, and see occasional
 hangs:

- I see these only on Linux. On Mac OS X, I never do.
- I'm using GHC 7.4.2
- I noticed the warning in the doc for forkProcess, but assumed I was
safe, as I wasn't holding any shared resources at the time of the fork, and
no shared resources in the program are used in the child.
- WIth gdb, I've traced the hang to here in the run-time: forkProcess
 discardTasksExcept  freeTask  closeMutex(task-lock)
 pthread_mutex_destroy

 The discussion in this thread leaves me with these questions:

- Is there reason to think the situation has gotten better in 7.6 and
later?
- Isn't the only reason *System.Process* is safer because it does an
immediate exec in the child? Alas, I really want to just fork()sometimes.
- Is it really true that even if my program has no shared resources
with the child, that the IO subsystem and FFI system do anyway? Surely the
RTS would take care of doing the right thing with those, no?
- There should be no concern with exec w.r.t. library invariants since
exec is wholesale replacement - all the libraries will reinitialize.
Is there a problem here I'm missing?

 Alas, I've stopped using the threaded RTS until I understand this better.

 - Mark
 ___ Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2013-01-20 Thread Mark Lentczner
Sorry to be reviving this thread so long after but I seem to be running
into similar issues as Michael S. did at the start.

In short, I'm using forkProcess with the threaded RTS, and see occasional
hangs:

   - I see these only on Linux. On Mac OS X, I never do.
   - I'm using GHC 7.4.2
   - I noticed the warning in the doc for forkProcess, but assumed I was
   safe, as I wasn't holding any shared resources at the time of the fork, and
   no shared resources in the program are used in the child.
   - WIth gdb, I've traced the hang to here in the run-time: forkProcess 
   discardTasksExcept  freeTask  closeMutex(task-lock)
pthread_mutex_destroy

The discussion in this thread leaves me with these questions:

   - Is there reason to think the situation has gotten better in 7.6 and
   later?
   - Isn't the only reason *System.Process* is safer because it does an
   immediate exec in the child? Alas, I really want to just fork()sometimes.
   - Is it really true that even if my program has no shared resources with
   the child, that the IO subsystem and FFI system do anyway? Surely the RTS
   would take care of doing the right thing with those, no?
   - There should be no concern with exec w.r.t. library invariants since
   exec is wholesale replacement - all the libraries will reinitialize. Is
   there a problem here I'm missing?

Alas, I've stopped using the threaded RTS until I understand this better.

- Mark
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2013-01-20 Thread Alexander Kjeldaas
I just looked at this code and since I don't know the code I can't give you
good solutions, but for others watching this thread the links might prove
interesting.

My main theory is that you do have some other thread in FFI-land while you
are fork()ing.  The task-cond, task-lock seems to be related to this (see
quoted comments below).

Also, pthread_mutex_destroy is undefined if the lock is locked, so I am
guessing that the task-lock is somehow locked when it shouldn't be.

It isn't clear from your description whether this is consistently happening
on Linux, or whether this only sometimes happens.

The forkProcess() code seems to hold all capabilities during fork, but that
does not include FFI-land threads AFAIU.

Assuming that this happens only rarely, I am trying to understand what
happens if the thread that is in FFI-land returns to the RTS (in the
parent) after fork(), but before the freeTask() in the child.  Based on the
descriptions I read, it seems likely that this thread will try to inspect
task-cap, which requires holding task-lock.

That would in turn make the pthread_mutex_destroy in the child invalid.

https://github.com/ghc/ghc/blob/master/rts/Task.h#L57


 ...
 When a task is migrated from sleeping on one Capability to another,
   its task-cap field must be modified.  When the task wakes up, it
   will read the new value of task-cap to find out which Capability
   it belongs to.  Hence some synchronisation is required on
   task-cap, and this is why we have task-lock.

   If the Task is not currently owned by task-id, then the thread is
   either

 (a) waiting on the condition task-cond.  The Task is either
 (1) a bound Task, the TSO will be on a queue somewhere
 (2) a worker task, on the spare_workers queue of task-cap.
   ...


freeTask:
https://github.com/ghc/ghc/blob/master/rts/Task.c#L142

the comment in freeTask refers to this test:
https://github.com/ghc/testsuite/blob/master/tests/concurrent/should_run/conc059.hs

That test calls the RTC from C which then forkIOs off actions that are
outstanding when the RTS exits.

in forkProcess, child code
https://github.com/ghc/ghc/blob/master/rts/Schedule.c#L1837

It look like all this code supports the notion that some other thread can
be in foreign code during the fork call.

discardTasksExcept
https://github.com/ghc/ghc/blob/master/rts/Task.c#L305


Alexander


On Mon, Jan 21, 2013 at 12:15 AM, Mark Lentczner
mark.lentcz...@gmail.comwrote:

 Sorry to be reviving this thread so long after but I seem to be
 running into similar issues as Michael S. did at the start.

 In short, I'm using forkProcess with the threaded RTS, and see occasional
 hangs:

- I see these only on Linux. On Mac OS X, I never do.
- I'm using GHC 7.4.2
- I noticed the warning in the doc for forkProcess, but assumed I was
safe, as I wasn't holding any shared resources at the time of the fork, and
no shared resources in the program are used in the child.
- WIth gdb, I've traced the hang to here in the run-time: forkProcess
 discardTasksExcept  freeTask  closeMutex(task-lock)
 pthread_mutex_destroy

 The discussion in this thread leaves me with these questions:

- Is there reason to think the situation has gotten better in 7.6 and
later?
- Isn't the only reason *System.Process* is safer because it does an
immediate exec in the child? Alas, I really want to just fork()sometimes.
- Is it really true that even if my program has no shared resources
with the child, that the IO subsystem and FFI system do anyway? Surely the
RTS would take care of doing the right thing with those, no?
- There should be no concern with exec w.r.t. library invariants since
exec is wholesale replacement - all the libraries will reinitialize.
Is there a problem here I'm missing?

 Alas, I've stopped using the threaded RTS until I understand this better.

 - Mark

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2013-01-20 Thread Alexander Kjeldaas
On Mon, Jan 21, 2013 at 12:15 AM, Mark Lentczner
mark.lentcz...@gmail.comwrote:

 Sorry to be reviving this thread so long after but I seem to be
 running into similar issues as Michael S. did at the start.

 In short, I'm using forkProcess with the threaded RTS, and see occasional
 hangs:

- I see these only on Linux. On Mac OS X, I never do.

 Previous versions of the linux pthreads library didn't hold any shared
resources in locks, so pthread_mutex_destroy could not hang.  Now Linux is
much improved, and thus it hangs (see pthread_mutex_destroy man page) ;-).


- I'm using GHC 7.4.2
- I noticed the warning in the doc for forkProcess, but assumed I was
safe, as I wasn't holding any shared resources at the time of the fork, and
no shared resources in the program are used in the child.
- WIth gdb, I've traced the hang to here in the run-time: forkProcess
 discardTasksExcept  freeTask  closeMutex(task-lock)
 pthread_mutex_destroy

 The discussion in this thread leaves me with these questions:

- Is there reason to think the situation has gotten better in 7.6 and
later?
- Isn't the only reason *System.Process* is safer because it does an
immediate exec in the child? Alas, I really want to just fork()sometimes.

 If you immediately do exec() in the child, you can use vfork() which
blocks the parent.  This serializes the actions and makes this whole mess
smooth and consistent.



- Is it really true that even if my program has no shared resources
with the child, that the IO subsystem and FFI system do anyway? Surely the
RTS would take care of doing the right thing with those, no?

 It looks like the RTS is trying to do a lot of things to control all the
Haskell threads etc.  But I don't think it waits for FFI-land threads
before commencing a fork(), so that's why I'm guessing that some
interaction between threads using FFI and fork() could be the issue.


- There should be no concern with exec w.r.t. library invariants since
exec is wholesale replacement - all the libraries will reinitialize.
Is there a problem here I'm missing?

 I think that's right. vfork() + exec() can be serialized and deterministic
thus is a lot easier to reason about.

Alexander
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2012-10-17 Thread Alexander Kjeldaas
On 17 October 2012 00:17, Mike Meyer m...@mired.org wrote:

 On Tue, 16 Oct 2012 21:55:44 +0200
 Alexander Kjeldaas alexander.kjeld...@gmail.com wrote:

  There are variants of this, but the meta-problem is that at the point in
  time when you call forkProcess, you must control all threads, ensuring
 that
  *all invariants hold*.  Thus no locks held, no thread is in the C
 library,
  no foreign calls active etc.  As an example, if one thread is in the C
  library doing some stdio, then the invariants in that library will not
  hold, and you cannot expect stdio to work in the child.  This means that
  the only thing you can really do in the child process is call exec.

 Further, you can only call exec if you make sure that the exec
 correctly reverts everything back to a state where those invariants
 hold. Mostly, this is automatic as resources get freed on exec and do
 the right thing. Locks on file descriptors that aren't closed on exec
 will leave dangling locks, and locks on files that are closed on exec
 will unexpectedly close them in the parent.


Right.  It should be renamed mostlyUnsafeForkProcess, assuming the
multi-threaded RTS is mostly the default one.

Alexander


mike
 --
 Mike Meyer m...@mired.org  http://www.mired.org/
 Independent Software developer/SCM consultant, email for more information.

 O ascii ribbon campaign - stop html mail - www.asciiribbon.org

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2012-10-16 Thread Michael Snoyman
On Mon, Oct 15, 2012 at 6:30 PM, Joey Hess j...@kitenet.net wrote:

 Michael Snoyman wrote:
  I think I have a misunderstanding of how forkProcess should be working.
  Ultimately this relates to some bugs in the development version of
 keter, but
  I've found some behavior in a simple test program which I wouldn't have
  expected either, which may or may not be related.
 
  With the program at the end of this email, I would expect that, once per
  second, I would get a message printed from each forkIO'd green thread,
 the
  forked process, and the master process. And if I spawn 8 or less child
 threads
  that's precisely what happens. However, as soon as I up that number to
 9, the
  child process is never run. The process is, however, created, as can be
  confirmed by looking at the process table.
 
  This only occurs when using the multithreaded runtime. In other words,
  compiling with ghc --make test.hs seems to always produce the expected
  output, whereas ghc --make -threaded test.hs causes the behavior
 described
  above. Having looked through the code for the process package a bit, my
 initial
  guess is that this is being caused by a signal being sent to the child
 process,
  but I'm not familiar enough with the inner workings to confirm or
 disprove this
  guess.
 
  If anyone has any ideas on this, I'd appreciate it.

 While I'm not reproducing that behavior here with your test case and
 7.4.1, I recently converted a large program to use -threaded (because I
 needed to use yesod in it, actually :), and had large quantities of pain
 involving forkProcess. It seemed to come down to this easily overlooked
 note in the docs:

   forkProcess comes with a giant warning: since any other running threads
   are not copied into the child process, it's easy to go wrong: e.g. by
   accessing some shared resource that was held by another thread in the
   parent.

 In my experience, forkProcess often behaves incomprehensibly (to me)
 with -threaded, typically resulting in a forked process hanging, and
 quite often only some small percentage of the time, which makes it
 really hard to track down and try to diagnose what thunk might not be
 getting forced until after the fork, or whatever.

 I did some analysis and produced a test case for problems caused by
 use of forkProcess in parts of MissingH, here:
 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=681621

 My understanding is that System.Process avoids these problems by doing
 all the setup around forking a command in C code. I've banished
 forkProcess from my code base entirely, except for a double fork I need
 to daemonize, and I don't even trust that call. :/


Well, I tried switching my code to forking/execing from C in a very similar
manner to the process package, and it seems to work.

Thanks for the input everyone!

Michael
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2012-10-16 Thread Alexander Kjeldaas
On 15 October 2012 09:47, Michael Snoyman mich...@snoyman.com wrote:

 Hi all,

 I think I have a misunderstanding of how forkProcess should be working.
 Ultimately this relates to some bugs in the development version of keter,
 but I've found some behavior in a simple test program which I wouldn't have
 expected either, which may or may not be related.

 With the program at the end of this email, I would expect that, once per
 second, I would get a message printed from each forkIO'd green thread, the
 forked process, and the master process. And if I spawn 8 or less child
 threads that's precisely what happens. However, as soon as I up that number
 to 9, the child process is never run. The process is, however, created, as
 can be confirmed by looking at the process table.

 This only occurs when using the multithreaded runtime. In other words,
 compiling with ghc --make test.hs seems to always produce the expected
 output, whereas ghc --make -threaded test.hs causes the behavior
 described above. Having looked through the code for the process package a
 bit, my initial guess is that this is being caused by a signal being sent
 to the child process, but I'm not familiar enough with the inner workings
 to confirm or disprove this guess.

 If anyone has any ideas on this, I'd appreciate it.


Not being familiar with the implementation, I'll just note that combining
fork() with threads, as in the multithreaded RTS is going to have subtle
bugs, no matter what.  The only winning strategy is to not play that game.

There are variants of this, but the meta-problem is that at the point in
time when you call forkProcess, you must control all threads, ensuring that
*all invariants hold*.  Thus no locks held, no thread is in the C library,
no foreign calls active etc.  As an example, if one thread is in the C
library doing some stdio, then the invariants in that library will not
hold, and you cannot expect stdio to work in the child.  This means that
the only thing you can really do in the child process is call exec.

These issues do not exist in the non-threaded world.

Alexander


 Michael

 import System.Posix.Process (forkProcess, getProcessID)
 import Control.Concurrent (forkIO, threadDelay)
 import System.IO (hFlush, stdout)
 import System.Posix.Signals (signalProcess, sigKILL)
 import Control.Exception (finally)

 main :: IO ()
 main = do
 mapM_ spawnChild [1..9]
 child - forkProcess $ do
 putStrLn starting child
 hFlush stdout
 loop child 0
 print (child pid, child)
 hFlush stdout

 -- I've commented out the finally so that the zombie process stays
 alive,
 -- to prove that it was actually created.
 loop parent 0 -- `finally` signalProcess sigKILL child

 spawnChild :: Int - IO ()
 spawnChild i = do
 _ - forkIO $ loop (spawnChild  ++ show i) 0
 return ()

 loop :: String - Int - IO ()
 loop msg i = do
 pid - getProcessID
 print (pid, msg, i)
 hFlush stdout
 threadDelay 100
 loop msg (i + 1)

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2012-10-16 Thread Donn Cave
Since we're talking about forkIO here - not forkOS - is it possible
to control the use of OS threads to avoid this problem?  As I
understand it, the problem is with real OS threads.  A program
running entirely in multiple `green' threads will fork to the same
set of threads in the same state, creating no problem with internal
state.

I'm guessing this is why he sees the problem at 9 threads - maybe the
runtime picks up another OS thread at that point.  

Of course one could link the unthreaded RTS, and that will cause
everything, including runtime gc etc., to run in the parent thread,
true?  And there are some runtime options to control number of threads
scheduled.

Donn

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2012-10-16 Thread Richard O'Keefe
The problems with forkProcess really are not Haskell's fault.
You will find warnings in the documentation for C's fork():

There are limits to what you can do in the child process.
To be totally safe you should restrict yourself to only
executing async-signal safe operations until such time
as one of the exec functions is called.  All APIs,
including global data symbols, in any framework or
library should be assumed to be unsafe after a fork()
unless explicitly documented to be safe or async-signal safe.

That's actually pretty scary.  I'd always assumed that this was
one of the reasons why the posix_spawn() function and its support
crew were devised.  Which reminds me that I expected to find
posix_spawn() in System.Posix.Process but didn't.

http://www.haskell.org/ghc/docs/7.4-latest/html/libraries/unix-2.5.1.1/System-Posix-Process.html
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2012-10-16 Thread Mike Meyer
On Tue, 16 Oct 2012 21:55:44 +0200
Alexander Kjeldaas alexander.kjeld...@gmail.com wrote:

 There are variants of this, but the meta-problem is that at the point in
 time when you call forkProcess, you must control all threads, ensuring that
 *all invariants hold*.  Thus no locks held, no thread is in the C library,
 no foreign calls active etc.  As an example, if one thread is in the C
 library doing some stdio, then the invariants in that library will not
 hold, and you cannot expect stdio to work in the child.  This means that
 the only thing you can really do in the child process is call exec.

Further, you can only call exec if you make sure that the exec
correctly reverts everything back to a state where those invariants
hold. Mostly, this is automatic as resources get freed on exec and do
the right thing. Locks on file descriptors that aren't closed on exec
will leave dangling locks, and locks on files that are closed on exec
will unexpectedly close them in the parent.

   mike
-- 
Mike Meyer m...@mired.org http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O ascii ribbon campaign - stop html mail - www.asciiribbon.org

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2012-10-15 Thread Michael Snoyman
Hi all,

I think I have a misunderstanding of how forkProcess should be working.
Ultimately this relates to some bugs in the development version of keter,
but I've found some behavior in a simple test program which I wouldn't have
expected either, which may or may not be related.

With the program at the end of this email, I would expect that, once per
second, I would get a message printed from each forkIO'd green thread, the
forked process, and the master process. And if I spawn 8 or less child
threads that's precisely what happens. However, as soon as I up that number
to 9, the child process is never run. The process is, however, created, as
can be confirmed by looking at the process table.

This only occurs when using the multithreaded runtime. In other words,
compiling with ghc --make test.hs seems to always produce the expected
output, whereas ghc --make -threaded test.hs causes the behavior
described above. Having looked through the code for the process package a
bit, my initial guess is that this is being caused by a signal being sent
to the child process, but I'm not familiar enough with the inner workings
to confirm or disprove this guess.

If anyone has any ideas on this, I'd appreciate it.

Michael

import System.Posix.Process (forkProcess, getProcessID)
import Control.Concurrent (forkIO, threadDelay)
import System.IO (hFlush, stdout)
import System.Posix.Signals (signalProcess, sigKILL)
import Control.Exception (finally)

main :: IO ()
main = do
mapM_ spawnChild [1..9]
child - forkProcess $ do
putStrLn starting child
hFlush stdout
loop child 0
print (child pid, child)
hFlush stdout

-- I've commented out the finally so that the zombie process stays
alive,
-- to prove that it was actually created.
loop parent 0 -- `finally` signalProcess sigKILL child

spawnChild :: Int - IO ()
spawnChild i = do
_ - forkIO $ loop (spawnChild  ++ show i) 0
return ()

loop :: String - Int - IO ()
loop msg i = do
pid - getProcessID
print (pid, msg, i)
hFlush stdout
threadDelay 100
loop msg (i + 1)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2012-10-15 Thread Simon Marechal
On 15/10/2012 09:47, Michael Snoyman wrote:
 With the program at the end of this email, I would expect that, once per
 second, I would get a message printed from each forkIO'd green thread,
 the forked process, and the master process. And if I spawn 8 or less
 child threads that's precisely what happens. However, as soon as I up
 that number to 9, the child process is never run. The process is,
 however, created, as can be confirmed by looking at the process table.

FWIW I can reproduce this problem. The strace in both cases is very
different. In the non threaded runtime, everything works as you expect:

7519  select(2, [], [1], NULL, {0, 0})  = 1 (out [1], left {0, 0})
7519  write(1, starting child\n, 15)  = 15
7519  select(2, [], [1], NULL, {0, 0})  = 1 (out [1], left {0, 0})
7519  write(1, (7519,\child\,0)\n, 17) = 17
7519  timer_settime(0x3, 0, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
7519  select(0, [], [], NULL, {0, 94} unfinished ...
7519  ... select resumed )= 0 (Timeout)
7519  timer_settime(0x3, 0, {it_interval={0, 1000}, it_value={0,
1000}},  unfinished ...
7519  ... timer_settime resumed NULL) = 0
7519  select(2, [], [1], NULL, {0, 0} unfinished ...
7519  ... select resumed )= 1 (out [1], left {0, 0})
7519  write(1, (7519,\child\,1)\n, 17 unfinished ...

In the threaded case, it:
* closes FD 3,4,5,6
* epoll_create, pipe, all kind of things
* clones twice
* ???

It seems that forkProcess is implemented entirely differently in the
threaded runtime.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2012-10-15 Thread Alexander V Vershilov
Mon, Oct 15, 2012 at 09:47:35AM +0200, Michael Snoyman wrote
 Hi all,
 
 I think I have a misunderstanding of how forkProcess should be working.
 Ultimately this relates to some bugs in the development version of keter, but
 I've found some behavior in a simple test program which I wouldn't have
 expected either, which may or may not be related.
 
 With the program at the end of this email, I would expect that, once per
 second, I would get a message printed from each forkIO'd green thread, the
 forked process, and the master process. And if I spawn 8 or less child threads
 that's precisely what happens. However, as soon as I up that number to 9, the
 child process is never run. The process is, however, created, as can be
 confirmed by looking at the process table.
 
 This only occurs when using the multithreaded runtime. In other words,
 compiling with ghc --make test.hs seems to always produce the expected
 output, whereas ghc --make -threaded test.hs causes the behavior described
 above. Having looked through the code for the process package a bit, my 
 initial
 guess is that this is being caused by a signal being sent to the child 
 process,
 but I'm not familiar enough with the inner workings to confirm or disprove 
 this
 guess.
 
 If anyone has any ideas on this, I'd appreciate it.
 
 Michael
 
 import System.Posix.Process (forkProcess, getProcessID)
 import Control.Concurrent (forkIO, threadDelay)
 import System.IO (hFlush, stdout)
 import System.Posix.Signals (signalProcess, sigKILL)
 import Control.Exception (finally)
 
 main :: IO ()
 main = do
     mapM_ spawnChild [1..9]
     child - forkProcess $ do
         putStrLn starting child
         hFlush stdout
         loop child 0
     print (child pid, child)
     hFlush stdout
 
     -- I've commented out the finally so that the zombie process stays 
 alive,
     -- to prove that it was actually created.
     loop parent 0 -- `finally` signalProcess sigKILL child
 
 spawnChild :: Int - IO ()
 spawnChild i = do
     _ - forkIO $ loop (spawnChild  ++ show i) 0
     return ()
 
 loop :: String - Int - IO ()
 loop msg i = do
     pid - getProcessID
     print (pid, msg, i)
     hFlush stdout
     threadDelay 100
     loop msg (i + 1)

Sounds like a bug and is not reproducing on ghc-7.6

--
Alexander Vershilov


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2012-10-15 Thread 山本和彦
Michael,

 Having looked through the code for the process package a bit, my
 initial guess is that this is being caused by a signal being sent to the child
 process, but I'm not familiar enough with the inner workings to confirm or
 disprove this guess.

To remove that comment for finally, you need the following:

_ - installHandler sigCHLD Ignore Nothing

Put this line into the beginning of the main. And import necessary
things from System.Posix.Signals.

--Kazu

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2012-10-15 Thread Joey Hess
Michael Snoyman wrote:
 I think I have a misunderstanding of how forkProcess should be working.
 Ultimately this relates to some bugs in the development version of keter, but
 I've found some behavior in a simple test program which I wouldn't have
 expected either, which may or may not be related.
 
 With the program at the end of this email, I would expect that, once per
 second, I would get a message printed from each forkIO'd green thread, the
 forked process, and the master process. And if I spawn 8 or less child threads
 that's precisely what happens. However, as soon as I up that number to 9, the
 child process is never run. The process is, however, created, as can be
 confirmed by looking at the process table.
 
 This only occurs when using the multithreaded runtime. In other words,
 compiling with ghc --make test.hs seems to always produce the expected
 output, whereas ghc --make -threaded test.hs causes the behavior described
 above. Having looked through the code for the process package a bit, my 
 initial
 guess is that this is being caused by a signal being sent to the child 
 process,
 but I'm not familiar enough with the inner workings to confirm or disprove 
 this
 guess.
 
 If anyone has any ideas on this, I'd appreciate it.

While I'm not reproducing that behavior here with your test case and
7.4.1, I recently converted a large program to use -threaded (because I
needed to use yesod in it, actually :), and had large quantities of pain
involving forkProcess. It seemed to come down to this easily overlooked
note in the docs:

  forkProcess comes with a giant warning: since any other running threads
  are not copied into the child process, it's easy to go wrong: e.g. by
  accessing some shared resource that was held by another thread in the
  parent.

In my experience, forkProcess often behaves incomprehensibly (to me)
with -threaded, typically resulting in a forked process hanging, and
quite often only some small percentage of the time, which makes it
really hard to track down and try to diagnose what thunk might not be
getting forced until after the fork, or whatever.

I did some analysis and produced a test case for problems caused by
use of forkProcess in parts of MissingH, here:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=681621

My understanding is that System.Process avoids these problems by doing
all the setup around forking a command in C code. I've banished
forkProcess from my code base entirely, except for a double fork I need
to daemonize, and I don't even trust that call. :/

-- 
see shy jo


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2012-10-15 Thread Brandon Allbery
On Mon, Oct 15, 2012 at 12:30 PM, Joey Hess j...@kitenet.net wrote:

   forkProcess comes with a giant warning: since any other running threads
   are not copied into the child process, it's easy to go wrong: e.g. by
   accessing some shared resource that was held by another thread in the
   parent.

 In my experience, forkProcess often behaves incomprehensibly (to me)
 with -threaded, typically resulting in a forked process hanging, and
 quite often only some small percentage of the time, which makes it
 really hard to track down and try to diagnose what thunk might not be
 getting forced until after the fork, or whatever.


The forkProcess MissingH discussion recently left me feeling like this is
going to be triggered in all cases, because it sounds like one of the
problematic threads is the I/O manager's thread?

-- 
brandon s allbery kf8nh   sine nomine associates
allber...@gmail.com  ballb...@sinenomine.net
unix/linux, openafs, kerberos, infrastructure  http://sinenomine.net
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] forkProcess, forkIO, and multithreaded runtime

2012-10-15 Thread 山本和彦
 My understanding is that System.Process avoids these problems by doing
 all the setup around forking a command in C code. I've banished
 forkProcess from my code base entirely, except for a double fork I need
 to daemonize, and I don't even trust that call. :/

I think you are right. forkProcess is necessary to deamonize a process
but I don't know other usage.

daemonize :: IO () - IO ()
daemonize program = ensureDetachTerminalCanWork $ do
detachTerminal
ensureNeverAttachTerminal $ do
changeWorkingDirectory /
void $ setFileCreationMask 0
mapM_ closeFd [stdInput, stdOutput, stdError]
program
  where
ensureDetachTerminalCanWork p = do
void $ forkProcess p
exitSuccess
ensureNeverAttachTerminal p = do
void $ forkProcess p
exitSuccess
detachTerminal = void createSession

--Kazu


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe