Re: Re: Re: [Haskell-cafe] Re: A problem with par and modules boundaries...
On Sat, 2009-05-23 at 20:42 -0400, Mario Blažević wrote: On Sat 23/05/09 2:51 PM , Duncan Coutts duncan.cou...@worc.ox.ac.uk sent: On Sat, 2009-05-23 at 13:31 -0400, Mario Blažević wrote: ... So the function is not strict, and I don't understand why GHC should evaluate the arguments before the call. Right, it's lazy in the first and strict in the second argument. As far as I can see we have no evidence that is is evaluating anything before the call. When I look at the Core definition of `test', it begins with \ (n1axl::integer:GHCziIntegerziInternals.Integer) (n2axn::integer:GHCziIntegerziInternals.Integer) - %let as1sU :: integer:GHCziIntegerziInternals.Integer = base:DataziList.prod1 (main:Main.factors2 (base:DataziList.prod1 (base:GHCziNum.upzulist main:Main.lvl main:Main.lvl n1axl) base:DataziList.lvl1)) base:DataziList.lvl1 %in %case integer:GHCziIntegerziInternals.Integer (ghczmprim:GHCziPrim.parzh @ integer:GHCziIntegerziInternals.Integer as1sU) %of (dsapq::ghczmprim:GHCziPrim.Intzh) I recommend using -ddump-simpl, as it produces more readable output. To my untrained eyes, this looks like it's evaluating product $ factors $ product [1..n1]) which is the first argument to `parallelize'. That core code is doing: let blah = product $ factors $ product thelist in case par# blah of _ - ... So it's calling the primitive par# but nothing is forced to WHNF (except the unused dummy return value of par#). I assume that %case in Core evaluates the argument to WHNF, just like case in Haskell. Yes. (In fact case in core always reduces to WHNF, where as in Haskell case (...) of bar - (...) does not force anything.) Then again, I could be completely misinterpreting what Core is, because I can't find any call to `parallelize' before or after that. It appears to be inlined in Core, regardless of whether the pragma {-# INLINE parallelize #-} is there or not. Yes, because even without the pragma, ghc decides it's profitable to inline. Actually, I can't see any effect of that pragma in the core files whatsoever, but it certainly has effect on run time. How about diffing the whole core output (and using -ddump-simpl). If there's a performance difference then there must be a difference in the core code too. Or do you mean to say that *your* installation of GHC behaves the same when the function `parallelize' is defined in the same module and when it's imported? Yes, exactly. With both ghc-6.10.1 and a very recent build of ghc-6.11 Duncan ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Re: Re: Re: [Haskell-cafe] Re: A problem with par and modules boundaries...
I recommend using -ddump-simpl, as it produces more readable output. Actually, I can't see any effect of that pragma in the core files whatsoever, but it certainly has effect on run time. How about diffing the whole core output (and using -ddump-simpl). If there's a performance difference then there must be a difference in the core code too. Ok. $ ghc-6.11.20090421 --make primes-test.hs -threaded -O2 -ddump-simpl main.simpl $ time ./primes-test +RTS -N2 4001 real0m9.636s user0m18.201s sys 0m0.088s $ ghc-6.11.20090421 --make primes-test.hs -threaded -O2 -ddump-simpl imported.simpl $ time ./primes-test +RTS -N24001 real0m17.547s user0m17.453s sys 0m0.052s I can't exactly use diff because the generated identifier names are not the same, but after poring over with Emacs ediff I have found only one difference that's not attributable to identifiers: $diff main.simpl imported.simpl ... 223c232 a_s1rs [ALWAYS Just L] :: GHC.Integer.Internals.Integer --- a_s1sV [ALWAYS Just S] :: GHC.Integer.Internals.Integer ... Does this S vs. L difference have anything to do with strictness and laziness? That line is a part of the `Main.test' definition: $ diff -C 3 main.simpl imported.simpl *** 217,244 [Arity 2 Str: DmdType LL] Main.test = ! \ (n1_ahR :: GHC.Integer.Internals.Integer) ! (n2_ahT :: GHC.Integer.Internals.Integer) - let { ! a_s1rs [ALWAYS Just L] :: GHC.Integer.Internals.Integer LclId [Str: DmdType] ! a_s1rs = Data.List.prod1 (Main.factors2 (Data.List.prod1 ! (GHC.Num.up_list Main.lvl Main.lvl n1_ahR) Data.List.lvl1)) Data.List.lvl1 } in ! case GHC.Prim.par# @ GHC.Integer.Internals.Integer a_s1rs of _ { __DEFAULT - case Data.List.prod1 (Main.factors2 (Data.List.prod1 ! (GHC.Num.up_list Main.lvl Main.lvl n2_ahT) Data.List.lvl1)) Data.List.lvl1 ! of x1_aUS { __DEFAULT - ! case GHC.Real.$wdivMod x1_aUS a_s1rs of _ { (# ww1_aUo, _ #) - ! ww1_aUo } } } Or do you mean to say that *your* installation of GHC behaves the same when the function `parallelize' is defined in the same module and when it's imported? Yes, exactly. With both ghc-6.10.1 and a very recent build of ghc-6.11 As you can see, I'm using the 6.11.20090421 build. I looked at recent ones, but the Linux builds seem to be less stable in May. I have the same results (though I didn't use -ddump-simpl) with 6.8.2. Can you recommend a different version to try? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Re: Re: Re: [Haskell-cafe] Re: A problem with par and modules boundaries...
On Sun, 2009-05-24 at 12:48 -0400, Mario Blažević wrote: How about diffing the whole core output (and using -ddump-simpl). If there's a performance difference then there must be a difference in the core code too. I can't exactly use diff because the generated identifier names are not the same, but after poring over with Emacs ediff I have found only one difference that's not attributable to identifiers: $diff main.simpl imported.simpl ... 223c232 a_s1rs [ALWAYS Just L] :: GHC.Integer.Internals.Integer --- a_s1sV [ALWAYS Just S] :: GHC.Integer.Internals.Integer ... Good find! Does this S vs. L difference have anything to do with strictness and laziness? Yes. So, I think we should open a ghc but report with all the details, particularly the example's source, the ghc version and that highlight of that strictness difference. Duncan ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: A problem with par and modules boundaries...
On Fri 22/05/09 10:51 AM , John Lato jwl...@gmail.com sent: Hi Mario, It looks like the parallelize function is getting inlined when it's in the same file, but not when it's in a separate file. Adding a {-# INLINE parallelize #-} pragma to the module with parallelize recovers all the performance for me. You could probably see exactly what's happening in more detail by going through the Core output. Thank you, this advice helped. The Core output indicates that function `test' evaluates the arguments to `parallelize' before it calls it. In other words, the call to `parallelize' is optimized as a strict function call -- which it is. The problem is that this optimization evaluates the arguments sequentially. Compiling with optimizations turned off regains the parallel execution. I guess I will report this as a GHC bug. Or is it a feature request? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: A problem with par and modules boundaries...
mblazevic: On Fri 22/05/09 10:51 AM , John Lato jwl...@gmail.com sent: Hi Mario, It looks like the parallelize function is getting inlined when it's in the same file, but not when it's in a separate file. Adding a {-# INLINE parallelize #-} pragma to the module with parallelize recovers all the performance for me. You could probably see exactly what's happening in more detail by going through the Core output. Thank you, this advice helped. The Core output indicates that function `test' evaluates the arguments to `parallelize' before it calls it. In other words, the call to `parallelize' is optimized as a strict function call -- which it is. The problem is that this optimization evaluates the arguments sequentially. Compiling with optimizations turned off regains the parallel execution. I guess I will report this as a GHC bug. Or is it a feature request? As Duncan suggessted, try with GHC head (grab a snapshot). `par` et al are much improved. -- Don ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Re: [Haskell-cafe] Re: A problem with par and modules boundaries...
You could probably see exactly what's happening in more detail by going through the Core output. Thank you, this advice helped. The Core output indicates that function `test' evaluates the arguments to `parallelize' before it calls it. In other words, the call to `parallelize' is optimized as a strict function call -- which it is. The problem is that this optimization evaluates the arguments sequentially. Compiling with optimizations turned off regains the parallel execution. I guess I will report this as a GHC bug. Or is it a feature request? As Duncan suggessted, try with GHC head (grab a snapshot). `par` et al are much improved. I already have, with the snapshot from 21st of April. It behaves the same as 6.8.2, except it runs for twice as long. I'd like to take back a part of what I said before, though: `parallelize' should be strict only in its second argument. Its strictness in the first argument should be the same as with `par`. Even though `parallelize x y' always evaluates both x and y, the following test works fine with optimizations even if `parallelize' is imported: main = putStrLn (snd $ parallelize undefined Hello, World!) So the function is not strict, and I don't understand why GHC should evaluate the arguments before the call. Does anybody know of a pragma or another way to make a function *non-strict* even if it does always evaluate its argument? In other words, is there a way to selectively disable the strictness optimization? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Re: [Haskell-cafe] Re: A problem with par and modules boundaries...
On Sat, May 23, 2009 at 7:31 PM, Mario Blažević mblaze...@stilo.com wrote: Does anybody know of a pragma or another way to make a function *non-strict* even if it does always evaluate its argument? In other words, is there a way to selectively disable the strictness optimization? parallelize a b | False = (undefined, undefined) | otherwise = a `par` (b `pseq` (a, b)) might do, unless strictness analysis is smart enough to know that the False guard is always, well, False. --Max ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Re: [Haskell-cafe] Re: A problem with par and modules boundaries...
On Sat, May 23, 2009 at 11:34 AM, Max Rabkin max.rab...@gmail.com wrote: On Sat, May 23, 2009 at 7:31 PM, Mario Blažević mblaze...@stilo.com wrote: Does anybody know of a pragma or another way to make a function *non-strict* even if it does always evaluate its argument? In other words, is there a way to selectively disable the strictness optimization? parallelize a b | False = (undefined, undefined) | otherwise = a `par` (b `pseq` (a, b)) might do, unless strictness analysis is smart enough to know that the False guard is always, well, False. --Max ___ GHC.Prim.lazy? Alex ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Re: [Haskell-cafe] Re: A problem with par and modules boundaries...
On Sat, 2009-05-23 at 13:31 -0400, Mario Blažević wrote: You could probably see exactly what's happening in more detail by going through the Core output. Thank you, this advice helped. The Core output indicates that function `test' evaluates the arguments to `parallelize' before it calls it. In other words, the call to `parallelize' is optimized as a strict function call -- which it is. The problem is that this optimization evaluates the arguments sequentially. Compiling with optimizations turned off regains the parallel execution. I guess I will report this as a GHC bug. Or is it a feature request? As Duncan suggessted, try with GHC head (grab a snapshot). `par` et al are much improved. I already have, with the snapshot from 21st of April. It behaves the same as 6.8.2, except it runs for twice as long. I'd like to take back a part of what I said before, though: `parallelize' should be strict only in its second argument. parallelize a b = a `par` (b `pseq` (a, b)) GHC infers that strictness of parallelize. It thinks that it is lazy in both args (you can check this yourself using ghc --show-iface). That's because the definitions of par and pseq use the function 'lazy': -- The reason for the strange lazy call is that it fools -- the compiler into thinking that pseq and par are non-strict in -- their second argument (even if it inlines pseq at the call site). -- If it thinks pseq is strict in y, then it often evaluates -- y before x, which is totally wrong. pseq x y = x `seq` lazy y par x y = case (par# x) of { _ - lazy y } So GHC thinks that par is lazy in both arguments, while it thinks pseq is strict in the first and lazy in the second. Its strictness in the first argument should be the same as with `par`. Yes, which it is. Even though `parallelize x y' always evaluates both x and y, Be careful about what you mean. Yes, it starts the evaluation of x in parallel with y, but that does not mean it is strict in x, as you notice with your example of undefined below. the following test works fine with optimizations even if `parallelize' is imported: main = putStrLn (snd $ parallelize undefined Hello, World!) So the function is not strict, and I don't understand why GHC should evaluate the arguments before the call. Right, it's lazy in the first and strict in the second argument. As far as I can see we have no evidence that is is evaluating anything before the call. Does anybody know of a pragma or another way to make a function *non-strict* even if it does always evaluate its argument? In other words, is there a way to selectively disable the strictness optimization? Yes, which is what pseq and par already do. If there's a bug, we need to reproduce it and report it. I cannot reproduce it. Duncan ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: A problem with par and modules boundaries...
On Sat 23/05/09 4:15 PM , Alexander Dunlap alexander.dun...@gmail.com sent: Does anybody know of a pragma or another way to make a function *non-strict* even if it does always evaluate its argument? In other words, is there a way to selectively disable the strictness optimization? parallelize a b | False = (undefined, undefined) | otherwise = a `par` (b `pseq` (a, b)) might do, unless strictness analysis is smart enough to know that the False guard is always, well, False. ___ GHC.Prim.lazy? It's GHC.Exts.lazy nowadays, and it doesn't have any effect. Neither does the `| False' guard. The only way I found to disable the eager argument evaluation is to pass them in as functions: parallelize :: Num t = (t - a) - (t - b) - (a, b) parallelize a b = let a' = a 1 b' = b 1 in (a' `par` (b' `pseq` (a', b'))) Then it can be imported and called like this: test n1 n2 = let (p1, p2) = parallelize (\n0- product $ factors $ product [n0..n1]) (\n0- product $ factors $ product [n0..n2]) This solution is incredibly fragile. If the declared type of parallelize is modified by replacing t by Integer, the evaluation becomes eager again. Also, the argument functions can't simply take () for argument which would make this solution reasonable. If this is all the strictness optimizer's fault, I'm awed by how difficult it is to trick it into not doing its job. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Re: Re: [Haskell-cafe] Re: A problem with par and modules boundaries...
On Sat 23/05/09 2:51 PM , Duncan Coutts duncan.cou...@worc.ox.ac.uk sent: On Sat, 2009-05-23 at 13:31 -0400, Mario Blažević wrote: ... So the function is not strict, and I don't understand why GHC should evaluate the arguments before the call. Right, it's lazy in the first and strict in the second argument. As far as I can see we have no evidence that is is evaluating anything before the call. When I look at the Core definition of `test', it begins with \ (n1axl::integer:GHCziIntegerziInternals.Integer) (n2axn::integer:GHCziIntegerziInternals.Integer) - %let as1sU :: integer:GHCziIntegerziInternals.Integer = base:DataziList.prod1 (main:Main.factors2 (base:DataziList.prod1 (base:GHCziNum.upzulist main:Main.lvl main:Main.lvl n1axl) base:DataziList.lvl1)) base:DataziList.lvl1 %in %case integer:GHCziIntegerziInternals.Integer (ghczmprim:GHCziPrim.parzh @ integer:GHCziIntegerziInternals.Integer as1sU) %of (dsapq::ghczmprim:GHCziPrim.Intzh) To my untrained eyes, this looks like it's evaluating product $ factors $ product [1..n1]) which is the first argument to `parallelize'. I assume that %case in Core evaluates the argument to WHNF, just like case in Haskell. Then again, I could be completely misinterpreting what Core is, because I can't find any call to `parallelize' before or after that. It appears to be inlined in Core, regardless of whether the pragma {-# INLINE parallelize #-} is there or not. Actually, I can't see any effect of that pragma in the core files whatsoever, but it certainly has effect on run time. Does anybody know of a pragma or another way to make a function *non-strict* even if it does always evaluate its argument? In other words, is there a way to selectively disable the strictness optimization? Yes, which is what pseq and par already do. If there's a bug, we need to reproduce it and report it. I cannot reproduce it. If you mean that you can't reproduce anything that's contrary to the specification, that's not saying much: there are practically no guarantees on what `par' is supposed to accomplish. If you mean you can't reproduce anything you wouldn't expect, pray explain what is going on, because everybody else seems to be surprised. Or do you mean to say that *your* installation of GHC behaves the same when the function `parallelize' is defined in the same module and when it's imported? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe