Thread behavior in 7.8.3

2014-10-29 Thread Michael Jones
I have a general question about thread behavior in 7.8.3 vs 7.6.X

I moved from 7.6 to 7.8 and my application behaves very differently. I have 
three threads, an application thread that plots data with wxhaskell or sends it 
over a network (depends on settings), a thread doing usb bulk writes, and a 
thread doing usb bulk reads. Data is moved around with TChan, and TVar is used 
for coordination.

When the application was compiled with 7.6, my stream of usb traffic was 
smooth. With 7.8, there are lots of delays where nothing seems to be running. 
These delays are up to 40ms, whereas with 7.6 delays were 1ms or so.

When I add -N2 or -N4, the 7.8 program runs fine. But on 7.6 it runs fine 
without -N2/4.

The program is compiled -O2 with profiling. The -N2/4 version uses more memory, 
 but in both cases with 7.8 and with 7.6 there is no space leak.

I tried to compile and use -ls so I could take a look with threadscope, but the 
application hangs and writes no data to the file. The CPU fans run wild like it 
is in an infinite loop. It at least pops an unpainted wxhaskell window, so it 
got partially running.

One of my libraries uses option -fsimpl-tick-factor=200 to get around the 
compiler.

What do I need to know about changes to threading and event logging between 7.6 
and 7.8? Is there some general documentation somewhere that might help?

I am on Ubuntu 14.04 LTS. I downloaded the 7.8 tool chain tarball and 
installed it myself, after removing 7.6 with apt-get.

Any hints appreciated.

Mike


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Irreducible predicates error in Template Haskell

2014-10-29 Thread Sreenidhi Nair
Hello,

we were trying to reify a typeclass, which had a ConstraintKind and we hit
upon this error: Can't represent irreducible predicates in Template
Haskell:.

It seems that there is already a ghc bug [
https://ghc.haskell.org/trac/ghc/ticket/7021 ] filed and its status is set
as fixed, but there is a comment at the bottom in which the reviewer
recommends against merging immediately. Does anybody know when it would get
merged in?

-- 
Yours truly,
Sreenidhi Nair


Re: Irreducible predicates error in Template Haskell

2014-10-29 Thread Richard Eisenberg
This fix will not get merged into the 7.8.x development stream, but it is 
already available in HEAD and will be available in GHC 7.10.x. We try not to 
make breaking changes (and this is a breaking change) in the middle of a major 
version.

Richard




Re: Thread behavior in 7.8.3

2014-10-29 Thread Ben Gamari
Michael Jones m...@proclivis.com writes:

 I have a general question about thread behavior in 7.8.3 vs 7.6.X

 I moved from 7.6 to 7.8 and my application behaves very differently. I
 have three threads, an application thread that plots data with
 wxhaskell or sends it over a network (depends on settings), a thread
 doing usb bulk writes, and a thread doing usb bulk reads. Data is
 moved around with TChan, and TVar is used for coordination.

Are you using Bas van Dijk's `usb` library by any chance? If so, you
should be aware of this [1] issue.

 When the application was compiled with 7.6, my stream of usb traffic
 was smooth. With 7.8, there are lots of delays where nothing seems to
 be running. These delays are up to 40ms, whereas with 7.6 delays were
 1ms or so.

 When I add -N2 or -N4, the 7.8 program runs fine. But on 7.6 it runs
 fine without -N2/4.

 The program is compiled -O2 with profiling. The -N2/4 version uses
 more memory, but in both cases with 7.8 and with 7.6 there is no space
 leak.

Have you looked at the RTS's output when run with `+RTS -sstderr`?
Is productivity any different between the two tests?

 I tried to compile and use -ls so I could take a look with
 threadscope, but the application hangs and writes no data to the file.
 The CPU fans run wild like it is in an infinite loop.

Oh dear, this doesn't sound good at all. Have you tried getting a
backtrace out of gdb? Usually this isn't terribly useful but in this
case since the event log is involved it might be getting stuck in the RTS
which should give a useful backtrace. If not, perhaps strace will give
some clues as to what is happening (you'll probably want to hide
SIGVTALRM to improve signal/noise)?

Cheers,

- Ben


[1] https://github.com/basvandijk/usb/issues/7




Re: Thread behavior in 7.8.3

2014-10-29 Thread Michael Jones
Ben,

I am using Bas van Dijk’s usb, and I am past the -threading issue by using the 
latest commit.

I don’t have any easy way of making comparisons between 7.6 and 7.8 
productivity, but from oscilloscope activity, I can’t see any difference. The 
only difference I see is the thread scheduling on 7.8 for -N1 vs -N2/4.

If -sstderr gives some notion of productivity, I’ll have to do an experiment 
between -N1 and -N2/4. Uncharted territory for me. I’ll set up and experiment 
tonight.

I am not familiar with strace. I’ll fix that soon.

Mike




Re: Recursion on TypeNats

2014-10-29 Thread Richard Eisenberg
I don't think we'll need notation to differentiate: just use overloaded 
literals, like we do in terms. Something that would operate vaguely like this:

 type family 3 :: k where
   3 :: Nat = ... -- 3 as a Nat
   3 :: Integer = ... -- 3 as an Integer

I'm not at all suggesting it be implemented this way, but we already have the 
ability to branch in type families based on result kind, so the mechanism is 
already around. Unfortunately, this would be unhelpful if the user asked for (3 
:: Bool), which would kind-check but be stuck.

Richard

On Oct 28, 2014, at 8:24 PM, Iavor Diatchki iavor.diatc...@gmail.com wrote:

 Hello,
 
 actually type-level integers are easier to work with than type-level naturals 
 (e.g., one can cancel things by subtracting at will).   I agree that ideally 
 we want to have both integers and naturals (probably as separate kinds).  I 
 just don't know what notation to use to distinguish the two. 
 
 -Iavor
 
 
 
 On Mon, Oct 27, 2014 at 2:13 PM, Barney Hilken b.hil...@ntlworld.com wrote:
 Ok, I've created a ticket https://ghc.haskell.org/trac/ghc/ticket/9731
 
 Unfortunately I don't know enough about ghc internals to try implementing it.
 


Re: Thread behavior in 7.8.3

2014-10-29 Thread Ben Gamari
Michael Jones m...@proclivis.com writes:

 Ben,

 I am using Bas van Dijk’s usb, and I am past the -threading issue by
 using the latest commit.

Excellent; I hadn't noticed the proclivis in your email address.

 I don’t have any easy way of making comparisons between 7.6 and 7.8
 productivity, but from oscilloscope activity, I can’t see any
 difference. The only difference I see is the thread scheduling on 7.8
 for -N1 vs -N2/4.

 If -sstderr gives some notion of productivity, I’ll have to do an
 experiment between -N1 and -N2/4. Uncharted territory for me. I’ll
 set up and experiment tonight.

Indeed it does; productivity in this context refers to the fraction of
runtime spent in evaluation (as opposed to in the garbage collector, for
instance).
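
As a rough illustration of that definition, here is a minimal sketch that
computes the same kind of ratio from inside a program. It assumes a newer GHC
than the 7.8 under discussion (the `GHC.Stats.getRTSStats` API from base 4.10
onwards) and that the program is run with `+RTS -T`. The names `productivity`
and `reportProductivity`, and the workload in `main`, are made up for this
example.

```haskell
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- | Fraction of total time spent in the mutator (evaluation) rather than
-- elsewhere, for instance in the garbage collector. Returns 0 when no
-- time has been recorded, to avoid dividing by zero.
productivity :: Integer -> Integer -> Double
productivity mutatorNs totalNs
  | totalNs <= 0 = 0
  | otherwise    = fromIntegral mutatorNs / fromIntegral totalNs

-- | Print the CPU-time productivity of the current process so far.
reportProductivity :: IO ()
reportProductivity = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS stats are off; rerun with +RTS -T"
    else do
      s <- getRTSStats
      let p = productivity (fromIntegral (mutator_cpu_ns s))
                           (fromIntegral (cpu_ns s))
      putStrLn ("productivity (CPU time): " ++ show (100 * p) ++ "%")

main :: IO ()
main = do
  -- Hypothetical workload: allocate enough to make the collector run.
  print (length (show (product [1 .. 3000 :: Integer])))
  reportProductivity
```

Comparing this figure (or the one printed by `-sstderr`) between the -N1 and
-N2/4 runs should show whether the delays are time lost to the collector or
time in which nothing was scheduled.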

 I am not familiar with strace. I’ll fix that soon.

It's often an invaluable tool; that being said, it remains to be seen
whether it yields anything useful in this particular case.

Cheers,

- Ben




Re: Thread behavior in 7.8.3

2014-10-29 Thread John Lato
By any chance do the delays get shorter if you run your program with `+RTS
-C0.005` ?  If so, I suspect you're having a problem very similar to one
that we had with ghc-7.8 (7.6 too, but it's worse on ghc-7.8 for some
reason), involving possible misbehavior of the thread scheduler.




Re: Thread behavior in 7.8.3

2014-10-29 Thread Michael Jones
John,

Adding -C0.005 makes it much better. Using -C0.001 makes it behave more like 
-N4.

Thanks. This saves my project, as I need to deploy on a single core Atom and 
was stuck.

Mike




Re: Thread behavior in 7.8.3

2014-10-29 Thread John Lato
I guess I should explain what that flag does...

The GHC RTS maintains capabilities; the number of capabilities is specified
by the `+RTS -N` option.  Each capability is a virtual machine that
executes Haskell code, and maintains its own runqueue of threads to process.

A capability will perform a context switch at the next heap block
allocation (every 4k of allocation) after the timer expires.  The timer
defaults to 20ms, and can be set by the -C flag.  Capabilities perform
context switches in other circumstances as well, such as when a thread
yields or blocks.
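
To illustrate the last point, here is a minimal sketch of a long-running loop
that yields voluntarily between chunks of work, so that on a single capability
other runnable threads get a turn without waiting for the timer. The function
`sumWithYields` and the chunk size are hypothetical; this is not code from the
application under discussion.

```haskell
import Control.Concurrent (yield)
import Control.Exception (evaluate)

-- | Sum a list in chunks, calling 'yield' after each chunk so that other
-- threads on the same capability can run. A chunk size below 1 is
-- treated as 1 so the loop always makes progress.
sumWithYields :: Int -> [Int] -> IO Int
sumWithYields chunkSize = go 0
  where
    step = max 1 chunkSize
    go acc [] = return acc
    go acc xs = do
      let (chunk, rest) = splitAt step xs
      acc' <- evaluate (acc + sum chunk)  -- force the work for this chunk now
      yield                               -- hand the capability to other threads
      go acc' rest

main :: IO ()
main = sumWithYields 1000 [1 .. 1000000] >>= print
```

A smaller chunk size gives other threads more frequent turns at the cost of
more scheduler overhead, which is the same trade-off as lowering -C.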

My guess is that either the context switching logic changed in ghc-7.8, or
possibly your code used to trigger a switch via some other mechanism (stack
overflow or something maybe?), but is optimized differently now so instead
it needs to wait for the timer to expire.

The problem we had was that a time-sensitive thread was getting scheduled
on the same capability as a long-running non-yielding thread, so the
time-sensitive thread had to wait for a context switch timeout (even though
there were free cores available!).  I expect even with -N4 you'll still see
occasional delays (perhaps 5% of calls).

We've solved our problem with judicious use of `forkOn`, but that won't
help at -N1.
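
For readers unfamiliar with `forkOn`, a minimal sketch of the idea follows. It
is not our production code: `runPinned` is a hypothetical helper that runs an
action on a requested capability and reports where the thread actually ran. It
assumes the program is built with -threaded and run with `+RTS -N2` or more;
with a single capability every thread ends up on capability 0.

```haskell
import Control.Concurrent
  (forkOn, getNumCapabilities, myThreadId, threadCapability)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- | Run an action in a thread pinned to the given capability. Returns
-- the action's result, the capability the thread ran on, and whether
-- the RTS considers the thread locked to that capability.
runPinned :: Int -> IO a -> IO (a, Int, Bool)
runPinned cap action = do
  done <- newEmptyMVar
  _ <- forkOn cap $ do
    r <- action
    (actualCap, locked) <- threadCapability =<< myThreadId
    putMVar done (r, actualCap, locked)
  takeMVar done

main :: IO ()
main = do
  n <- getNumCapabilities
  -- Ask for capability 1; the number is taken modulo the capability count.
  (_, cap, locked) <- runPinned 1 (return ())
  putStrLn ("capabilities: " ++ show n
            ++ ", ran on: " ++ show cap
            ++ ", locked: " ++ show locked)
```

In a real program the time-sensitive thread would be forked on one capability
and the long-running thread on another, so the two never share a run queue.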

We did see this behavior in 7.6, but it's definitely worse in 7.8.

Incidentally, has there been any interest in a work-stealing scheduler?
There was a discussion from about 2 years ago, in which Simon Marlow noted
it might be tricky, but it would definitely help in situations like this.

John L.







Re: Thread behavior in 7.8.3

2014-10-29 Thread Edward Z. Yang
I don't think this is directly related to the problem, but if you have a
thread that isn't yielding, you can force it to yield by using
-fno-omit-yields on your code.  It won't help if the non-yielding code
is in a library, and it won't help if the problem was that you just
weren't setting timeouts finely enough (which sounds like what was
happening). FYI.
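
A minimal sketch of the kind of code the flag is aimed at, assuming the module
is compiled with -O so that the loop runs without allocating. `countUp` and
the iteration count are made up for illustration, and the second argument is
assumed to be non-negative. With the pragma removed, the timeout may be unable
to interrupt the loop until it finishes.

```haskell
{-# OPTIONS_GHC -fno-omit-yields #-}

import Control.Exception (evaluate)
import System.Timeout (timeout)

-- | A tight loop over machine integers. Compiled with -O, a loop like
-- this typically runs without allocating, so without -fno-omit-yields
-- it may give the scheduler no point at which to preempt it.
countUp :: Int -> Int -> Int
countUp acc 0 = acc
countUp acc n = acc `seq` countUp (acc + 1) (n - 1)

main :: IO ()
main = do
  -- Give the loop 0.1 seconds; the iteration count is arbitrary but large.
  r <- timeout 100000 (evaluate (countUp 0 2000000000))
  putStrLn (maybe "interrupted by the timeout" (("finished: " ++) . show) r)
```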

Edward



Re: Thread behavior in 7.8.3

2014-10-29 Thread Edward Z. Yang
Yes, that's right.

I brought it up because you mentioned that there might still be
occasional delays, and those might be caused by a thread not being
preemptible for a while.

Edward

Excerpts from John Lato's message of 2014-10-29 17:31:45 -0700:
 My understanding is that -fno-omit-yields is subtly different.  I think
 that's for the case when a function loops without performing any heap
 allocations, and thus would never yield even after the context switch
 timeout.  In my case the looping function does perform heap allocations and
 does eventually yield, just not until after the timeout.
 
 Is that understanding correct?
 
 (technically, doesn't it change to yielding after stack checks or something
 like that?)
 