>> ghc-2.08-linux-i386-all  binary snapshot  ...
>> ...
>> A small program Main.hs compiled with -O runs 10 times faster or 
>> slower depending on the export list
>>                        module Main (main)
>> or                     module Main (main,test)
>> Is this a bug?


> Can you send the code?

It is at the end of this letter.


>> Is it worth setting things like  {-# inline 0 f #-}  or  
>>                                  {-# inline 1 f #-}  ?
>> Shouldn't  ghc -c -O ...  be clever enough to guess what to inline?


> It's worth it if you want to be sure they'll be inlined.


You mean you *recommend* setting  {-# inline l f #-}  for any 
function (expression?)  f  whose in-lining the programmer is sure 
will help?  And with  l = 0, 1, ...  ?
My dim idea was that  -O  should do most of the useful in-linings 
by itself, so that the ordinary user had better not intrude into this ...
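
For concreteness, such an annotation might look as follows. This is 
only a sketch: it uses the pragma spelling of later GHC releases 
(without the numeric level quoted above), and the helper  step  is a 
hypothetical function invented for the illustration, not part of the 
project below.

```haskell
module Main (main) where

-- Hypothetical small helper, used only to illustrate the pragma.
-- INLINE asks the compiler to unfold `step` at its call sites;
-- whether this actually speeds anything up depends on the program.
{-# INLINE step #-}
step :: Int -> Int -> Int -> Int
step n m k = (k + m) `mod` n

main :: IO ()
main = print (map (step 7 3) [0 .. 6])
```

Whether the pragma is needed at all, or  -O  already makes the same 
choice, is exactly the open question here.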


>> I am trying  -O  in Makefile  for some critical modules of a certain 
>> large project, but cannot predict when this turns out to be useful.
>> ...
>> In  ghc-0.29  the performance for -O was predictable.

> ...  Just say "-O" and your program should run faster.  How much 
> faster does depend a lot on the program.
> ..
> if you find a program where 0.29 makes a worthwhile improvement and
> 2.08 does not then please send it to us ...


It looks like I could reconstruct which tests ought to run  1.5-2.5  
times faster.
But let us first find out how to -O the simple project below:




-----------------------------------------------------------------
module  M (short)  where

-- Number of cycles in permutation.
-- Nice, but not too efficient for large lists.

short :: Eq a =>  [a] -> (a -> a) -> Integer
               -- xs     f           
         -- Method:
         -- if  [x] in  (x:xs)  is not a cycle,  and  f(x) = y,
         -- then make certain  f1  on  xs  that differs from  f 
         -- only in counter-image of  x  and recurse.

short [] _ =  error "short [] _"
short xs f =  nc  xs f    
  where
  nc []     _ =  0
  nc (x:xs) f =  let  y = f x  in

    if y==x then  1+ (nc xs f)  else   nc xs (f1 y)    where

                               f1 y z =  let  z1=(f z)  in
                                           if z1==x then y else z1
------------------------------------------------------------------
-- BENCHMARK.
-- Build a list  ns = [0..n-1] ;
-- form the permutations   \x -> mod (x+m) n,     m <- ns;
-- compute and print the number of cycles for each of the above 
-- permutations.

module  Main (main)  where

import M (short)

main =  writeFile "result" (show (test 300))
 
test n =  let  { ns = [0..(n-1)];  tr m k = mod (k+m) n }
          in   
               map  (\m-> short ns (tr m))  ns
------------------------------------------------------------------



"Making":
ghc -c -O M.hs;    ghc -c Main.hs;    ghc -o run Main.o M.o;

Running:
time  ./run  +RTS -H8m -RTS    takes  70 sec.  (on my computer).

This should write to  ./result:

[300, 1, 2, 3, 4, 5, 6, 1, 4, 3, 10, 1, 12, 1, 2, 15, 4, 1, 6, 1, 20, 
 3, 2, 1, 12, 25, 2, 3, 4, 1, 30, 1, 4, 3, 2, 5, 12, 1, 2, 3, 20,
 1, 6, 1, 4, 15, 2, 1, 12, 1, 50, 3, 4, 1, 6, 5, 4, 3, 2, 1, 60,
 1, 2, 3, 4, 5, 6, 1, 4, 3, 10, 1, 12, 1, 2, 75, 4, 1, 6, 1, 20, 
 3, 2, 1, 12, 5, 2, 3, 4, 1, 30, 1, 4, 3, 2, 5, 12, 1, 2, 3, 100, 
 1, 6, 1, 4, 15, 2, 1, 12, 1, 10, 3, 4, 1, 6, 5, 4, 3, 2, 1, 60, 
 1, 2, 3, 4, 25, 6, 1, 4, 3, 10, 1, 12, 1, 2, 15, 4, 1, 6, 1, 20,
 3, 2, 1, 12, 5, 2, 3, 4, 1, 150, 1, 4, 3, 2, 5, 12, 1, 2, 3, 20, 
 1, 6, 1, 4, 15, 2, 1, 12, 1, 10, 3, 4, 1, 6, 25, 4, 3, 2, 1, 60, 
 1, 2, 3, 4, 5, 6, 1, 4, 3, 10, 1, 12, 1, 2, 15, 4, 1, 6, 1, 100, 
 3, 2, 1, 12, 5, 2, 3, 4, 1, 30, 1, 4, 3, 2, 5, 12, 1, 2, 3, 20,
 1, 6, 1, 4, 75, 2, 1, 12, 1, 10, 3, 4, 1, 6, 5, 4, 3, 2, 1, 60, 
 1, 2, 3, 4, 5, 6, 1, 4, 3, 50, 1, 12, 1, 2, 15, 4, 1, 6, 1, 20, 
 3, 2, 1, 12, 5, 2, 3, 4, 1, 30, 1, 4, 3, 2, 25, 12, 1, 2, 3, 20, 
 1, 6, 1, 4, 15, 2, 1, 12, 1, 10, 3, 4, 1, 6, 5, 4, 3, 2, 1
]
------------------------------------------------------------------
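
The expected output can be cross-checked without running  short  at 
all: the shift  \x -> mod (x+m) n  on  [0..n-1]  splits into 
gcd m n  cycles (with  gcd 0 n = n  for the identity). The sketch 
below relies on that fact; the name  expected  is hypothetical and 
introduced only for this check.

```haskell
module Main (main) where

-- Cycle counts of the shifts  k -> (k + m) `mod` n  for m <- [0 .. n-1],
-- computed directly as gcd m n instead of by walking the permutation.
-- `expected` is a hypothetical name used only for this cross-check.
expected :: Integer -> [Integer]
expected n = [gcd m n | m <- [0 .. n - 1]]

main :: IO ()
main = print (take 13 (expected 300))
-- prints [300,1,2,3,4,5,6,1,4,3,10,1,12]
```

Comparing  show (expected 300)  with the contents of  ./result  gives 
a quick correctness test that is independent of the optimization flags.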


Joining these modules into *one*   

                   module Main (main) where ...

and compiling with  -O  gives  6.1 sec.
This is the *only* situation in which I found  -O  working.

But if we then set        `module Main (main,test) where' 

- just out of curiosity - we return to 70 sec.




------------------
Sergey Mechveliani

[EMAIL PROTECTED]


