Re: [Haskell-cafe] does the compiler optimize repeated calls?

2006-09-06 Thread John Hughes

John Hughes wrote:

The trouble is that this isn't always an optimisation. Try these two
programs:

powerset [] = [[]]
powerset (x:xs) = powerset xs++map (x:) (powerset xs)

and

powerset [] = [[]]
powerset (x:xs) = pxs++map (x:) pxs
 where pxs = powerset xs
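For concreteness, both definitions compute the same results; an
illustrative GHCi session (output as implied by the definitions):

   ghci> powerset [1,2]
   [[],[2],[1],[1,2]]
   ghci> length (powerset [1..20])
   1048576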

Try computing length (powerset [1..n]) with each definition. For small
n, the second is faster. As n gets larger, the second gets slower and
slower, but the first keeps chugging along. The problem is that the
second has exponentially higher peak memory requirements than the
first. Round about n=25, on my machine, all other programs stop
responding while the second one runs. You don't really want a compiler
to make that kind of "pessimisation" to your program, which is why it's
a deliberate decision to leave most CSE to the programmer. You can
still write the second version, and suffer the consequences, but at
least you know it's your own fault!


Thanks for the above example. I found it quite difficult to understand
why the second is worse than the first for large n, but I think the
reason is that you're using the second definition in conjunction with
(length). It is therefore the *combination* of the cse'd (powerset)
with (length) that is less efficient: (length) just reads its input as
a stream, so there is no need for the whole of (powerset xs) to exist
in memory, and thus the non-cse'd version gives a faster
(length . powerset).


Yes... not just length, of course, but any function that consumes its
input "lazily", or perhaps I should say "in one pass". For example, if
you print out the result of powerset, then the print function makes
only one pass over it, and the first version will run in space linear
in n, while the second takes exponential space. But then you'll be
doing so much I/O that you won't be able to run the code for such large
n in reasonable time--that's the reason I chose length in my example:
it's a list consumer that isn't I/O-bound.


Just to be explicit, the reason the second is worse is that the pointer to 
pxs from the expression map (x:) pxs prevents the garbage collector from 
recovering the space pxs occupies, while the pxs++ is being computed and 
consumed. So you end up computing all of pxs while the pxs++ is running, AND 
STORING THE RESULT, and then making a second pass over it with map (x:) pxs, 
during which pxs can be garbage collected as it is processed. In the first 
version, we compute powerset xs twice, but each time, every cell is 
constructed, then immediately processed and discarded, so every garbage 
collection reclaims almost all the allocated memory.
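To see this concretely, one could compile each version into a small
driver like the sketch below and compare the "maximum residency"
figures reported by GHC's runtime statistics (the bound n=25 and the
file name are just illustrative; newer GHCs may also need -rtsopts):

   main :: IO ()
   main = print (length (powerset [1..25 :: Int]))

   -- e.g.  ghc --make Power.hs -o power  &&  ./power +RTS -s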


It would be great if the compiler could make use of the context in
which a function is being applied to produce optimized code across
function boundaries. In the above example of (length . powerset),
(length) has no interest in the contents of the powerset itself, so
could the compiler not fuse (length . powerset) into the following
function:


   lengthPowerset [] = 1
   lengthPowerset (x:xs) = 2 * lengthPowerset xs

The compiler would need to analyse the definitions of (++) and (map) to
discover that


   length (x ++ y) === length x + length y

   length (map f y) === length y

and with that knowledge I imagine the steps could be something like:

   lengthPowerset [] = length (powerset []) = length ([[]]) = 1

   lengthPowerset (x:xs)
   = length (powerset xs ++ map (x:) (powerset xs))
   = length (powerset xs) + length (map (x:) (powerset xs))
   = length (powerset xs) + length (powerset xs)
   = lengthPowerset xs + lengthPowerset xs
   = 2 * lengthPowerset xs

After getting the function (lengthPowerset) as above, I'd also expect
the compiler to apply a transformation into a tail-recursive function:


   lengthPowerset y = lengthPowerset' y 1
     where
       lengthPowerset' [] i = i
       lengthPowerset' (_:xs) i = lengthPowerset' xs $! 2 * i

resulting in a tightly coded machine code loop to rival, or greatly 
exceed(!), the best efforts of C.
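For comparison, a hedged sketch of the same loop written with a strict
left fold from Data.List; with optimisation this should compile to much
the same tight accumulator loop:

   import Data.List (foldl')

   lengthPowerset :: [a] -> Integer
   lengthPowerset = foldl' (\n _ -> 2 * n) 1   -- double once per element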


In the meantime I tend to code in Haskell just expecting these kinds of
optimizations to be done (unless I'm writing a really time-critical
piece of code that can't wait), knowing of course that they might not
be done just at the moment, but at least some time in the (hopefully
not too distant) future... ;-)


Regards, Brian.


You know, I suspect you could get a lot of this to happen by
programming GHC's optimiser using rewrite rules. But I'm going to leave
it as an exercise for the reader (he he :-). For the compiler to do all
this without guidance would, I suspect, require much more theorem
proving than it will be reasonable for compilers to do for a long, long
time.
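For instance, a sketch of what such rewrite rules might look like
(untested; whether they actually fire depends on how length, (++) and
map are inlined, and GHC's own fusion rules may rewrite these functions
first):

   {-# RULES
   "length/++"   forall xs ys. length (xs ++ ys) = length xs + length ys
   "length/map"  forall f xs.  length (map f xs) = length xs
     #-}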


John 




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] does the compiler optimize repeated calls?

2006-09-06 Thread Brian Hulley

John Hughes wrote:

The trouble is that this isn't always an optimisation. Try these two
programs:

powerset [] = [[]]
powerset (x:xs) = powerset xs++map (x:) (powerset xs)

and

powerset [] = [[]]
powerset (x:xs) = pxs++map (x:) pxs
 where pxs = powerset xs

Try computing length (powerset [1..n]) with each definition. For small
n, the second is faster. As n gets larger, the second gets slower and
slower, but the first keeps chugging along. The problem is that the
second has exponentially higher peak memory requirements than the
first. Round about n=25, on my machine, all other programs stop responding
while the second one runs. You don't really want a compiler to make
that kind of "pessimisation" to your program, which is why it's a
deliberate decision to leave most CSE to the programmer. You can
still write the second version, and suffer the consequences, but at
least you know it's your own fault!


Thanks for the above example. I found it quite difficult to understand
why the second is worse than the first for large n, but I think the
reason is that you're using the second definition in conjunction with
(length). It is therefore the *combination* of the cse'd (powerset)
with (length) that is less efficient: (length) just reads its input as
a stream, so there is no need for the whole of (powerset xs) to exist
in memory, and thus the non-cse'd version gives a faster
(length . powerset).


It would be great if the compiler could make use of the context in
which a function is being applied to produce optimized code across
function boundaries. In the above example of (length . powerset),
(length) has no interest in the contents of the powerset itself, so
could the compiler not fuse (length . powerset) into the following
function:


   lengthPowerset [] = 1
   lengthPowerset (x:xs) = 2 * lengthPowerset xs

The compiler would need to analyse the definitions of (++) and (map) to
discover that


   length (x ++ y) === length x + length y

   length (map f y) === length y

and with that knowledge I imagine the steps could be something like:

   lengthPowerset [] = length (powerset []) = length ([[]]) = 1

   lengthPowerset (x:xs)
   = length (powerset xs ++ map (x:) (powerset xs))
   = length (powerset xs) + length (map (x:) (powerset xs))
   = length (powerset xs) + length (powerset xs)
   = lengthPowerset xs + lengthPowerset xs
   = 2 * lengthPowerset xs
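As a quick sanity check of this derivation, a hedged QuickCheck sketch
(assuming the powerset and lengthPowerset definitions above; the input
is truncated so the 2^n-sized lists stay small):

   import Test.QuickCheck

   prop_fused :: [Int] -> Bool
   prop_fused xs = lengthPowerset xs' == length (powerset xs')
     where xs' = take 12 xs

   -- ghci> quickCheck prop_fused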

After getting the function (lengthPowerset) as above, I'd also expect
the compiler to apply a transformation into a tail-recursive function:


   lengthPowerset y = lengthPowerset' y 1
     where
       lengthPowerset' [] i = i
       lengthPowerset' (_:xs) i = lengthPowerset' xs $! 2 * i

resulting in a tightly coded machine code loop to rival, or greatly 
exceed(!), the best efforts of C.


In the meantime I tend to code in Haskell just expecting these kinds of
optimizations to be done (unless I'm writing a really time-critical
piece of code that can't wait), knowing of course that they might not
be done just at the moment, but at least some time in the (hopefully
not too distant) future... ;-)


Regards, Brian.
--
Logic empowers us and Love gives us purpose.
Yet still phantoms restless for eras long past,
congealed in the present in unthought forms,
strive mightily unseen to destroy us.

http://www.metamilk.com 


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] does the compiler optimize repeated calls?

2006-09-06 Thread John Hughes



On 9/6/06, Tamas K Papp <[EMAIL PROTECTED]> wrote:

or does the compiler perform this optimization?  More generally, if a
function is invoked with the same parameters again (and it doesn't
involve anything like monads), does it make sense (performance-wise)
to "store" the result somewhere?


I was wondering something like this too, and then I found this email:
http://www.haskell.org/pipermail/glasgow-haskell-bugs/2004-December/004530.html

So I guess it is as Stephane said: theoretically possible but not actually done?

--eric the perpetual newbie

The trouble is that this isn't always an optimisation. Try these two 
programs:


powerset [] = [[]]
powerset (x:xs) = powerset xs++map (x:) (powerset xs)

and

powerset [] = [[]]
powerset (x:xs) = pxs++map (x:) pxs
 where pxs = powerset xs

Try computing length (powerset [1..n]) with each definition. For small 
n, the second is faster. As n gets larger, the second gets slower and 
slower, but the first keeps chugging along. The problem is that the 
second has exponentially higher peak memory requirements than the first. 
Round about n=25, on my machine, all other programs stop responding 
while the second one runs. You don't really want a compiler to make that 
kind of "pessimisation" to your program, which is why it's a deliberate 
decision to leave most CSE to the programmer. You can still write the 
second version, and suffer the consequences, but at least you know it's 
your own fault!


John

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] does the compiler optimize repeated calls?

2006-09-06 Thread Donald Bruce Stewart
tpapp:
> Hi,
> 
> I have a question about coding and compilers.  Suppose that a function
> is invoked with the same parameters inside another function declaration, eg
> 
> -- this example doesn't do anything particularly meaningful
> g a b c = let something1 = f a b
>               something2 = externalsomething (f a b) 42
>               something3 = externalsomething2 (137 * (f a b))
>           in ...
> 
> Does it help (performancewise) to have
> 
> g a b c = let resultoff = f a b
>               something2 = externalsomething resultoff 42
>               something3 = externalsomething2 (137 * resultoff)
>           in ...
> 
> or does the compiler perform this optimization?  More generally, if a
> function is invoked with the same parameters again (and it doesn't
> involve anything like monads), does it make sense (performance-wise)
> to "store" the result somewhere?
> 

on the wiki,
http://www.haskell.org/haskellwiki/Performance/GHC#Common_subexpressions
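For the more general question of "storing" results across calls, one
hand-rolled approach for a function of a small integral argument is to
cache results in a lazily built list, so each value is computed at most
once; an illustrative sketch (not from the wiki page):

   fib :: Int -> Integer
   fib = (fibs !!)
     where
       fibs = map f [0..]   -- results are retained by lazy evaluation
       f 0 = 0
       f 1 = 1
       f n = fib (n - 1) + fib (n - 2)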

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] does the compiler optimize repeated calls?

2006-09-06 Thread Eric Kow

Hi,

On 9/6/06, Tamas K Papp <[EMAIL PROTECTED]> wrote:

or does the compiler perform this optimization?  More generally, if a
function is invoked with the same parameters again (and it doesn't
involve anything like monads), does it make sense (performance-wise)
to "store" the result somewhere?


I was wondering something like this too, and then I found this email:
http://www.haskell.org/pipermail/glasgow-haskell-bugs/2004-December/004530.html

So I guess it is as Stephane said: theoretically possible but not actually done?

--eric the perpetual newbie

--
Eric Kow http://www.loria.fr/~kow
PGP Key ID: 08AC04F9 Please correct my French.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] does the compiler optimize repeated calls?

2006-09-06 Thread Alex Queiroz

Hello,

On 9/6/06, Tamas K Papp <[EMAIL PROTECTED]> wrote:


PS: I realize that I am asking a lot of newbie questions, and I would
like to thank everybody for their patience and elucidating replies.


   I am a newbie myself (second week of learning Haskell), but I'll
give it a shot: Since functions have no side effects, the compiler
executes the function only once.

--
-alex
http://www.ventonegro.org/
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe