RE: Optimization beyond the Module Border

2008-03-21 Thread Simon Peyton-Jones
| Would it be possible for the compiler to say something like: You are
| applying level 2 optimization but some dependencies were compiled without
| optimization enabled. To get full optimization, consider recompiling x,y,z
| with -O2 - at least this would give us a fighting chance to 'fix' things.

That's a reasonable suggestion.  Currently the .hi file for a module does not 
record whether or not optimisation was on when compiling that module, so this'd 
mean an interface-file format change, plus a bit of jiggery pokery.

Why not submit a Trac feature request?

Simon
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Optimization beyond the Module Border

2008-03-20 Thread Bernd Brassel
> I suspect that if all modules are compiled -O0, then you recompile one
> module with -O2, high up in the dependency graph (i.e. it depends on
> many lower-level modules), plus all things that in turn depend on it
> (--make), you will not get the good performance you expect.  None of the
> lower-level functions will have exported inlinings or fusion rules into
> the interface file.  _All_ modules must be recompiled with -O2,
> especially the bottom of the dependency chain, to get the best benefit
> from optimisation.
>
> Regards,
> Malcolm

I am very sorry, I think what Malcolm describes might be exactly what
had happened. Now that I tried to blow up the example from 0.122 msec to
get a more significant result, I can't reproduce the effect. Funny thing
though, as I was pretty keen on doing a thorough job as it was all about
measuring the quality of the work of the previous fortnight. Now I find
that - after all - I did a much better job than it seemed yesterday :o)

So there may be two (minor) issues left if you would be interested.
Firstly, about profiling in connection with optimization. When I
compiled things with -O2 AND -prof -auto-all no profile would be
written. Now you might think that having both at once is a silly idea, since
the side effects of profiling might be the first to be optimized away.
But I think it was not so silly after all as I had introduced a lot of
overhead into my programs which I was pretty sure could be optimized
away. Hence, I was not at all interested in the unoptimized profile. And
I think it is not so unusual to want to improve only those things that
the compiler cannot improve by itself. Couldn't the profiling things be
added AFTER all optimization was done?

And then, secondly, about the connection of optimization with side
effects. I had programs behave differently when compiling them all in
one go or module-wise. (And if I am not able to reproduce that effect as
well I will do a little merry dance!) Is this also interesting? Might it
be connected with what Don mentioned about the stream fusion package?
(Although I cannot remember any mention of side effects in Duncan's talk
in Freiburg.)

Thanks for your time and sorry once again for using the system all wrong!
Bernd




RE: Optimization beyond the Module Border

2008-03-20 Thread Simon Peyton-Jones
|  I'd be interested in any progress here -- we noticed issues with
|  optimisations in the stream fusion package across module boundaries
|  that we never tracked down. If there's some key things not firing,
|  that would be good to know.
|
| I suspect that if all modules are compiled -O0, then you recompile one
| module with -O2, high up in the dependency graph (i.e. it depends on
| many lower-level modules), plus all things that in turn depend on it
| (--make), you will not get the good performance you expect.  None of the
| lower-level functions will have exported inlinings or fusion rules into
| the interface file.  _All_ modules must be recompiled with -O2,
| especially the bottom of the dependency chain, to get the best benefit
| from optimisation.

Absolutely correct.

Should this be better documented?  If so, would someone like to think where in 
GHC's user manual they would have looked (or did look), and send me some text 
that would have helped them, had it been there?  As it were.

Simon


Re: Optimization beyond the Module Border

2008-03-20 Thread Ian Lynagh
On Thu, Mar 20, 2008 at 09:47:28AM +0100, Bernd Brassel wrote:
 
> compiled things with -O2 AND -prof -auto-all no profile would be
> written.

This should work, for the reasons that you give. Did you use options
like +RTS -p when running the program? If so, please give us an example
to reproduce the problem.

> And then, secondly, about the connection of optimization with side
> effects. I had programs behave differently when compiling them all in
> one go or module-wise.

Again, please tell us how to reproduce this.


Thanks
Ian
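
For the profiling question above, here is a minimal sketch of an optimised, profiled build. The module, the function name expensive, and the cost-centre label are hypothetical and not from the thread; the flags are the ones named in the discussion. It assumes a GHC of that era (newer releases spell -auto-all as -fprof-auto) and that the profile is written to a file named after the executable, here Main.prof, in the current directory.

```haskell
module Main where

import Data.List (foldl')

-- Build and run (flags as discussed in the thread):
--   ghc --make -O2 -prof -auto-all Main.hs
--   ./Main +RTS -p        -- should leave the profile in ./Main.prof

-- A manually annotated cost centre; -auto-all additionally annotates
-- every top-level function automatically.
expensive :: Int -> Int
expensive n = {-# SCC "expensive_loop" #-} foldl' (+) 0 [1 .. n]

main :: IO ()
main = print (expensive 1000000)
```

If no profile seems to appear, it is worth checking that the file being inspected is the .prof next to the executable that was actually run.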



RE: Optimization beyond the Module Border

2008-03-20 Thread Matthew Pocock
> |  I'd be interested in any progress here -- we noticed issues with
> |  optimisations in the stream fusion package across module boundaries
> |  that we never tracked down. If there's some key things not firing,
> |  that would be good to know.
> |
> | I suspect that if all modules are compiled -O0, then you recompile one
> | module with -O2, high up in the dependency graph (i.e. it depends on
> | many lower-level modules), plus all things that in turn depend on it
> | (--make), you will not get the good performance you expect.  None of the
> | lower-level functions will have exported inlinings or fusion rules into
> | the interface file.  _All_ modules must be recompiled with -O2,
> | especially the bottom of the dependency chain, to get the best benefit
> | from optimisation.
>
> Absolutely correct.
>
> Should this be better documented?  If so, would someone like to think
> where in GHC's user manual they would have looked (or did look), and send
> me some text that would have helped them, had it been there?  As it were.
>
> Simon

Would it be possible for the compiler to say something like: You are
applying level 2 optimization but some dependencies were compiled without
optimization enabled. To get full optimization, consider recompiling x,y,z
with -O2 - at least this would give us a fighting chance to 'fix' things.

Matthew



Re: Optimization beyond the Module Border

2008-03-20 Thread Don Stewart
bbr:
>> I suspect that if all modules are compiled -O0, then you recompile one
>> module with -O2, high up in the dependency graph (i.e. it depends on
>> many lower-level modules), plus all things that in turn depend on it
>> (--make), you will not get the good performance you expect.  None of the
>> lower-level functions will have exported inlinings or fusion rules into
>> the interface file.  _All_ modules must be recompiled with -O2,
>> especially the bottom of the dependency chain, to get the best benefit
>> from optimisation.
>>
>> Regards,
>> Malcolm
>
> I am very sorry, I think what Malcolm describes might be exactly what
> had happened. Now that I tried to blow up the example from 0.122 msec to
> get a more significant result, I can't reproduce the effect. Funny thing
> though, as I was pretty keen on doing a thorough job as it was all about
> measuring the quality of the work of the previous fortnight. Now I find
> that - after all - I did a much better job than it seemed yesterday :o)
>
> So there may be two (minor) issues left if you would be interested.
> Firstly, about profiling in connection with optimization. When I
> compiled things with -O2 AND -prof -auto-all no profile would be
> written. Now you might think that having both at once is a silly idea,
> the side effects of profiling might be the first to be optimized away.

You almost always want to profile with full optimisations on.
Otherwise it's not even close to measuring the kind of code you're
actually running.

> But I think it was not so silly after all as I had introduced a lot of
> overhead into my programs which I was pretty sure could be optimized
> away. Hence, I was not at all interested in the unoptimized profile. And
> I think it is not so unusual to want to improve only those things that
> the compiler cannot improve by itself. Couldn't the profiling things be
> added AFTER all optimization was done?
>
> And then, secondly, about the connection of optimization with side
> effects. I had programs behave differently when compiling them all in
> one go or module-wise. (And if I am not able to reproduce that effect as
> well I will do a little merry dance!) Is this also interesting? Might it
> be connected with what Don mentioned about the stream fusion package?
> (Although I cannot remember mentioning any side effects in Duncan's talk
> in Freiburg.)

No, I can't think of any issue there.
  
-- Don


Re: Optimization beyond the Module Border

2008-03-20 Thread Bernd Brassel
Don Stewart wrote:
> You almost always want to profile with full optimisations on.
> Otherwise it's not even close to measuring the kind of code you're
> actually running.

Ian Lynagh wrote:
> This should work, for the reasons that you give. Did you use options
> like +RTS -p when running the program?

Yes guys, you are sooo right... And I think it is really time for the
easter holidays! Not that I forgot to turn on the RTS option but
something close to as stupid if not worse: I looked into the wrong file.
:o((

My gosh, before I stay to get the trophy of the most stupid mail of the
list, I think I will go home now...

Have a happy Easter!

Bernd


Optimization beyond the Module Border

2008-03-19 Thread Bernd Brassel
Hi all,

I have noticed that there is a great difference between optimizing
modules separately and all at once, e.g., with -fforce-recomp. I have
had examples with factors of up to 15 in run time (and even different behavior
in context with unsafePerformIO).

Is there any option that makes ghc write out that intermediate
optimization data it seems to use in order to get the same efficiency in
a module-wise compilation?

Greetings
Bernd
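
The thread never pins down what caused the different behavior with unsafePerformIO, so the following is only a sketch of one well-known way optimisation can interact with it, not a reconstruction of Bernd's program. The names counter and nextId are hypothetical. The assumption illustrated is that a top-level value built with unsafePerformIO should carry a NOINLINE pragma; without it, the optimiser is free to inline the definition, which could create a separate IORef at each use site and so change what the program does between build modes.

```haskell
module Main where

import Data.IORef (IORef, atomicModifyIORef, newIORef)
import System.IO.Unsafe (unsafePerformIO)

-- A global counter created with unsafePerformIO.
-- NOINLINE asks GHC to keep exactly one shared IORef.
counter :: IORef Int
counter = unsafePerformIO (newIORef 0)
{-# NOINLINE counter #-}

-- Return the current value and advance the counter by one.
nextId :: IO Int
nextId = atomicModifyIORef counter (\n -> (n + 1, n))

main :: IO ()
main = do
  a <- nextId
  b <- nextId
  print (a, b)  -- with the NOINLINE pragma in place: (0,1)
```

Removing the pragma does not guarantee a visible change; whether the definition is duplicated depends on the optimisation level and on how the modules are compiled.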


RE: Optimization beyond the Module Border

2008-03-19 Thread Simon Peyton-Jones
| I have noticed that there is a great difference between optimizing
| modules separately and all at once, e.g., with -fforce-recomp. I have
| had examples with factors of up to 15 in run time (and even different behavior
| in context with unsafePerformIO).

GHC does a lot of cross-module inlining already, and *does* write stuff into 
interface files, provided you use -O.

I'm always interested in performance differences of a factor of 15 though!  Can 
you supply an example (as small as poss) for us to look at?

Thanks

Simon



Re: Optimization beyond the Module Border

2008-03-19 Thread Bernd Brassel
Simon Peyton-Jones wrote:

> GHC does a lot of cross-module inlining already, and *does* write stuff into
> interface files, provided you use -O.

I used -O4. Is that the bad thing?

> I'm always interested in performance differences of a factor of 15 though!
> Can you supply an example (as small as poss) for us to look at?

Yes certainly, although small will be a big problem, I guess.
I admit that the factor 15 is a bit dubious since the fast run-time was
so small (1.88 sec vs. 0.112 sec).

I will see what I can do on the morrow.


Re: Optimization beyond the Module Border

2008-03-19 Thread Don Stewart
bbr:
> Simon Peyton-Jones wrote:
>
>> GHC does a lot of cross-module inlining already, and *does* write stuff
>> into interface files, provided you use -O.
>
> I used -O4. Is that the bad thing?

There's nothing above -O2.

However, I think that's ok -- it clamps -ON | N > 2 to -O2


>> I'm always interested in performance differences of a factor of 15 though!
>> Can you supply an example (as small as poss) for us to look at?
>
> Yes certainly, although small will be a big problem, I guess.
> I admit that the factor 15 is a bit dubious since the fast run-time was
> so small (1.88 sec vs. 0.112 sec).
>
> I will see what I can do on the morrow.

I'd be interested in any progress here -- we noticed issues with
optimisations in the stream fusion package across module boundaries
that we never tracked down. If there's some key things not firing,
that would be good to know.


Re: Optimization beyond the Module Border

2008-03-19 Thread Malcolm Wallace
> I'd be interested in any progress here -- we noticed issues with
> optimisations in the stream fusion package across module boundaries
> that we never tracked down. If there's some key things not firing,
> that would be good to know.

I suspect that if all modules are compiled -O0, then you recompile one
module with -O2, high up in the dependency graph (i.e. it depends on
many lower-level modules), plus all things that in turn depend on it
(--make), you will not get the good performance you expect.  None of the
lower-level functions will have exported inlinings or fusion rules into
the interface file.  _All_ modules must be recompiled with -O2,
especially the bottom of the dependency chain, to get the best benefit
from optimisation.

Regards,
Malcolm
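
To make the point above concrete, here is a minimal sketch of the kind of low-level definitions Malcolm describes. The names square, myMap and sumSquares and the rewrite rule are hypothetical and do not come from the thread. The assumption is that GHC only records unfoldings and rewrite rules in the .hi file when the defining module is itself compiled with -O or higher, so an importer built with -O2 can only inline square or fire the rule if this code was optimised too. Everything is kept in one file so the sketch runs on its own; in practice these definitions would live in a library module at the bottom of the dependency chain.

```haskell
module Main where

-- Small function we would like importers to inline. The unfolding
-- is only available across modules if this module is built with -O.
square :: Int -> Int
square x = x * x
{-# INLINE square #-}

-- A hand-written map, kept out of line so the rule below can match.
myMap :: (a -> b) -> [a] -> [b]
myMap _ []       = []
myMap f (x : xs) = f x : myMap f xs
{-# NOINLINE myMap #-}

-- A fusion-style rewrite rule: two traversals become one.
-- Rules are ignored at -O0, both here and in importing modules.
{-# RULES
"myMap/myMap" forall f g xs. myMap f (myMap g xs) = myMap (f . g) xs
  #-}

-- Sum of the squares of 1..n, written as two chained maps.
sumSquares :: Int -> Int
sumSquares n = sum (myMap square (myMap (+ 1) [0 .. n - 1]))

main :: IO ()
main = print (sumSquares 10)  -- prints 385
```

The result is the same whether or not the rule fires; only the amount of intermediate allocation and the run time should differ, which is why a missed optimisation of this kind shows up as a slowdown and not as an error.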