More delving later, it seems the incorrect optimized version has been
turned into:
case (\ _ [Occ=Dead, OS=OneShot] -> error "here") of _ [Occ=Dead] {}
While the working one has been turned into:
errorFunc argument realWorldToken
where errorFunc _ = error "here"
I'm not familiar with case ... of _ {} - what does it mean when there
are no alternatives to the case? And why isn't case on a literal
lambda optimised out? Is it the OneShot annotation (perhaps coming
from the state hack?)
The full trace is at
https://gist.github.com/ndmitchell/b222e04eb0c3a397c758. I've uploaded
bad (optimises the error out) and good (works as expected) versions of
the Core. The summary files are the subexpression that changed making
a single difference (moving a monomorphic NOINLINE function from one
module to another) plus the handful of functions they depend on, which
I've reformatted/inlined/simplified to produce the expressions above.
The full versions are the entire -ddump-simpl output from about
halfway through the build, starting when the differences occur - let
me know if you need further back. The dodgy function is "exec".
Thanks, Neil
On Thu, Mar 19, 2015 at 11:07 PM, Neil Mitchell <[email protected]> wrote:
> Herbert, thanks for the list of patches, nothing obvious there - my
> best guess is it's something incredibly sensitive and it only needs
> the tiniest change anywhere to make it happen. Things like moving
> NOINLINE monomorphic-type definitions from one module to another are
> causing the bug to appear/disappear, which isn't what I'd expect.
>
> Simon, changing from error to error in IO causes the bug to disappear,
> but then so do most things. The error return type is type IO (), so I
> suspect that forces it to be raised at the right place - but it's
> certainly one of the possibilities for what is going wrong. Diffing
> the Core is a great idea.
>
> I'll keep reducing and see what I get to. Given the sensitivity of the
> bug, I'm sure a NOINLINE on an out-of-the-way function will make it go
> away, so I can easily fix Shake itself - so I'm more tracking it down
> from the point of GHC now.
>
> Thanks, Neil
>
>
> On Wed, Mar 18, 2015 at 5:04 PM, Simon Peyton Jones
> <[email protected]> wrote:
>> I'm really sorry but I can't think of anything. Sounds horrible.
>>
>> If you throw exceptions using 'error' (not in IO), then you are of course
>> vulnerable to strictness changes. If the thing isn't actually evaluated
>> inside the catch block, you won't see the exception. But I'm sure you've
>> thought of that.
>>
>> I'd experiment with one of the smaller changes you describe, such as adding
>> a putStrLn, and comparing Core, before and after. Switching off -O will
>> make a huge difference, so hard to compare. Turning off the state hack will
>> have a more global effect. But the other changes sound more pin-point and
>> hence the differences will be smaller.
>>
>> Simon
>>
>> | -----Original Message-----
>> | From: ghc-devs [mailto:[email protected]] On Behalf Of Neil
>> | Mitchell
>> | Sent: 18 March 2015 15:33
>> | To: [email protected]
>> | Subject: Shake fails test with GHC 7.10 RC3
>> |
>> | Hi,
>> |
>> | Testing GHC 7.10 RC3 I've found a bug where Shake seems to catch the
>> | wrong exception in the wrong place. It's only hit by one of my tests,
>> | and I've managed to isolate it to a fragment of code with no
>> | unsafePerformIO, that throws exceptions using error (so not in IO), and
>> | operates in IO. Turning off the stack hack makes the bug go away, but
>> | then so does -O0, marking one of the functions it calls NOINLINE, or
>> | moving an INLINE function it calls to a different module, or adding a
>> | putStrLn under a catch block - it's very sensitive to the exact
>> | conditions. This test and this exact code worked fine with GHC 7.10
>> | RC2.
>> |
>> | I was wondering if there have been any state hack related changes or
>> | other potentially dangerous optimisation changes since RC2? I'll
>> | continue to try reducing the bug, but it's somewhat difficult as the
>> | larger system is quite big, and the code is very sensitive.
>> |
>> | Thanks, Neil
>> | _______________________________________________
>> | ghc-devs mailing list
>> | [email protected]
>> | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
On Thu, Mar 19, 2015 at 11:24 PM, Simon Peyton Jones
<[email protected]> wrote:
> Thanks! I think a -ddump-simpl before and after the smallest change the
> makes the difference would be illuminating.
>
> Simon
>
> | -----Original Message-----
> | From: Neil Mitchell [mailto:[email protected]]
> | Sent: 19 March 2015 23:07
> | To: Simon Peyton Jones
> | Cc: [email protected]
> | Subject: Re: Shake fails test with GHC 7.10 RC3
> |
> | Herbert, thanks for the list of patches, nothing obvious there - my
> | best guess is it's something incredibly sensitive and it only needs
> | the tiniest change anywhere to make it happen. Things like moving
> | NOINLINE monomorphic-type definitions from one module to another are
> | causing the bug to appear/disappear, which isn't what I'd expect.
> |
> | Simon, changing from error to error in IO causes the bug to disappear,
> | but then so do most things. The error return type is type IO (), so I
> | suspect that forces it to be raised at the right place - but it's
> | certainly one of the possibilities for what is going wrong. Diffing
> | the Core is a great idea.
> |
> | I'll keep reducing and see what I get to. Given the sensitivity of the
> | bug, I'm sure a NOINLINE on an out-of-the-way function will make it go
> | away, so I can easily fix Shake itself - so I'm more tracking it down
> | from the point of GHC now.
> |
> | Thanks, Neil
> |
> |
> | On Wed, Mar 18, 2015 at 5:04 PM, Simon Peyton Jones
> | <[email protected]> wrote:
> | > I'm really sorry but I can't think of anything. Sounds horrible.
> | >
> | > If you throw exceptions using 'error' (not in IO), then you are of
> | course vulnerable to strictness changes. If the thing isn't actually
> | evaluated inside the catch block, you won't see the exception. But I'm
> | sure you've thought of that.
> | >
> | > I'd experiment with one of the smaller changes you describe, such as
> | adding a putStrLn, and comparing Core, before and after. Switching off -
> | O will make a huge difference, so hard to compare. Turning off the state
> | hack will have a more global effect. But the other changes sound more
> | pin-point and hence the differences will be smaller.
> | >
> | > Simon
> | >
> | > | -----Original Message-----
> | > | From: ghc-devs [mailto:[email protected]] On Behalf Of
> | Neil
> | > | Mitchell
> | > | Sent: 18 March 2015 15:33
> | > | To: [email protected]
> | > | Subject: Shake fails test with GHC 7.10 RC3
> | > |
> | > | Hi,
> | > |
> | > | Testing GHC 7.10 RC3 I've found a bug where Shake seems to catch the
> | > | wrong exception in the wrong place. It's only hit by one of my
> | tests,
> | > | and I've managed to isolate it to a fragment of code with no
> | > | unsafePerformIO, that throws exceptions using error (so not in IO),
> | and
> | > | operates in IO. Turning off the stack hack makes the bug go away,
> | but
> | > | then so does -O0, marking one of the functions it calls NOINLINE, or
> | > | moving an INLINE function it calls to a different module, or adding
> | a
> | > | putStrLn under a catch block - it's very sensitive to the exact
> | > | conditions. This test and this exact code worked fine with GHC 7.10
> | > | RC2.
> | > |
> | > | I was wondering if there have been any state hack related changes or
> | > | other potentially dangerous optimisation changes since RC2? I'll
> | > | continue to try reducing the bug, but it's somewhat difficult as the
> | > | larger system is quite big, and the code is very sensitive.
> | > |
> | > | Thanks, Neil
> | > | _______________________________________________
> | > | ghc-devs mailing list
> | > | [email protected]
> | > | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
_______________________________________________
ghc-devs mailing list
[email protected]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs