Since I won't have access to my development computer during work hours tomorrow, I generated complete Core for both versions and put them in http://community.haskell.org/~ndm/temp/core.zip.

Thanks, Neil

On Fri, Mar 20, 2015 at 12:01 AM, Neil Mitchell <[email protected]> wrote:
> More delving later, it seems the incorrect optimized version has been
> turned into:
>
> case (\ _ [Occ=Dead, OS=OneShot] -> error "here") of _ [Occ=Dead] {}
>
> While the working one has been turned into:
>
> errorFunc argument realWorldToken
>
> where errorFunc _ = error "here"
>
> I'm not familiar with case ... of _ {} - what does it mean when there
> are no alternatives to the case? And why isn't case on a literal
> lambda optimised out? Is it the OneShot annotation (perhaps coming
> from the state hack?)
>
> The full trace is at
> https://gist.github.com/ndmitchell/b222e04eb0c3a397c758. I've uploaded
> bad (optimises the error out) and good (works as expected) versions of
> the Core. The summary files are the subexpression that changed making
> a single difference (moving a monomorphic NOINLINE function from one
> module to another) plus the handful of functions they depend on, which
> I've reformatted/inlined/simplified to produce the expressions above.
> The full versions are the entire -ddump-simpl output from about
> halfway through the build, starting when the differences occur - let
> me know if you need further back. The dodgy function is "exec".
>
> Thanks, Neil
>
> On Thu, Mar 19, 2015 at 11:07 PM, Neil Mitchell <[email protected]> wrote:
>> Herbert, thanks for the list of patches, nothing obvious there - my
>> best guess is it's something incredibly sensitive and it only needs
>> the tiniest change anywhere to make it happen. Things like moving
>> NOINLINE monomorphic-type definitions from one module to another are
>> causing the bug to appear/disappear, which isn't what I'd expect.
>>
>> Simon, changing from error to error in IO causes the bug to disappear,
>> but then so do most things. The error return type is type IO (), so I
>> suspect that forces it to be raised at the right place - but it's
>> certainly one of the possibilities for what is going wrong. Diffing
>> the Core is a great idea.
>>
>> I'll keep reducing and see what I get to. Given the sensitivity of the
>> bug, I'm sure a NOINLINE on an out-of-the-way function will make it go
>> away, so I can easily fix Shake itself - so I'm more tracking it down
>> from the point of GHC now.
>>
>> Thanks, Neil
>>
>>
>> On Wed, Mar 18, 2015 at 5:04 PM, Simon Peyton Jones
>> <[email protected]> wrote:
>>> I'm really sorry but I can't think of anything. Sounds horrible.
>>>
>>> If you throw exceptions using 'error' (not in IO), then you are of course
>>> vulnerable to strictness changes. If the thing isn't actually evaluated
>>> inside the catch block, you won't see the exception. But I'm sure you've
>>> thought of that.
>>>
>>> I'd experiment with one of the smaller changes you describe, such as adding
>>> a putStrLn, and comparing Core, before and after. Switching off -O will
>>> make a huge difference, so hard to compare. Turning off the state hack
>>> will have a more global effect. But the other changes sound more pin-point
>>> and hence the differences will be smaller.
>>>
>>> Simon
>>>
>>> | -----Original Message-----
>>> | From: ghc-devs [mailto:[email protected]] On Behalf Of Neil
>>> | Mitchell
>>> | Sent: 18 March 2015 15:33
>>> | To: [email protected]
>>> | Subject: Shake fails test with GHC 7.10 RC3
>>> |
>>> | Hi,
>>> |
>>> | Testing GHC 7.10 RC3 I've found a bug where Shake seems to catch the
>>> | wrong exception in the wrong place. It's only hit by one of my tests,
>>> | and I've managed to isolate it to a fragment of code with no
>>> | unsafePerformIO, that throws exceptions using error (so not in IO), and
>>> | operates in IO. Turning off the state hack makes the bug go away, but
>>> | then so does -O0, marking one of the functions it calls NOINLINE, or
>>> | moving an INLINE function it calls to a different module, or adding a
>>> | putStrLn under a catch block - it's very sensitive to the exact
>>> | conditions. This test and this exact code worked fine with GHC 7.10
>>> | RC2.
>>> |
>>> | I was wondering if there have been any state hack related changes or
>>> | other potentially dangerous optimisation changes since RC2? I'll
>>> | continue to try reducing the bug, but it's somewhat difficult as the
>>> | larger system is quite big, and the code is very sensitive.
>>> |
>>> | Thanks, Neil
>>> | _______________________________________________
>>> | ghc-devs mailing list
>>> | [email protected]
>>> | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
> On Thu, Mar 19, 2015 at 11:24 PM, Simon Peyton Jones
> <[email protected]> wrote:
>> Thanks! I think a -ddump-simpl before and after the smallest change that
>> makes the difference would be illuminating.
>>
>> Simon

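Simon's remark about 'error' outside IO (if the thing isn't evaluated inside the catch block, you won't see the exception) can be illustrated with a small standalone sketch. This is not the Shake code and is not claimed to reproduce the RC3 bug; the names boom, handler, lazyAttempt and strictAttempt are made up for the example, and it only assumes Control.Exception from base.

```haskell
import Control.Exception (ErrorCall, SomeException, catch, evaluate, try)

-- Hypothetical pure value that throws only when something evaluates it.
boom :: Int
boom = error "here"

-- Hypothetical handler: swallow an ErrorCall and return a default.
handler :: ErrorCall -> IO Int
handler _ = return 0

-- 'return boom' does not evaluate boom, so nothing is thrown inside the
-- catch; the unevaluated thunk is handed back to the caller.
lazyAttempt :: IO Int
lazyAttempt = return boom `catch` handler

-- 'evaluate' forces boom to weak head normal form inside the catch,
-- so the handler does see the ErrorCall.
strictAttempt :: IO Int
strictAttempt = evaluate boom `catch` handler

main :: IO ()
main = do
  -- Forcing the result outside the catch is where the exception surfaces.
  r1 <- try (lazyAttempt >>= evaluate) :: IO (Either SomeException Int)
  putStrLn (either (const "lazy: exception escaped the catch") show r1)
  r2 <- strictAttempt
  putStrLn ("strict: handler ran, result " ++ show r2)
```

The point of the sketch is only that where a pure 'error' is raised depends on where the value is forced, which is why changes in strictness or inlining can move an exception relative to a catch.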