Re: Reduction count as efficiency measure?

1998-11-30 Thread Olaf Chitil

Bart Demoen wrote:
 
 Simon wrote:
 
  CSE can also make space consumption asymptotically worse.
 
 I can understand this when garbage collection is around and because of
 CSE, some data structures don't get cleaned up; but is it also true
 without gc ?

If you don't use a rule like

e'[e,e] --> let x = e in e'[x,x]

for CSE, but limit yourself to rules like

let x = e in e'[e]  -->  let x = e in e'[x]

then you are right that the transformed program will not allocate more closures
than the original one (in fact, in most cases it allocates fewer).

However, the lifetimes of heap cells will often increase, and hence they cannot
be collected by the garbage collector as early as in the original program.
Eliminating just a few common subexpressions can increase the heap residency
of a program considerably.
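
To make the point concrete, here is a small sketch (not from the paper, just
an illustration of the residency effect):

-- Without sharing, each traversal builds and consumes its own copy of
-- [1 .. n], so list cells can be collected as soon as they have been
-- visited: residency stays constant.
f :: Int -> (Int, Int)
f n = (length [1 .. n], last [1 .. n])

-- After the CSE-style rewrite, the single shared list is kept alive from
-- the start of the first traversal to the end of the second, so residency
-- grows to O(n), even though fewer cells are allocated overall.
fShared :: Int -> (Int, Int)
fShared n = let xs = [1 .. n] in (length xs, last xs)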

If you want to know more about why I think that CSE is unfortunately not suitable
for lazy functional languages, have a look at my paper `Common Subexpressions are
Uncommon in Lazy Functional Languages' at
http://www-i2.informatik.rwth-aachen.de/~chitil/PUBLICATIONS/comSubsAreUncom.html

Olaf

-- 
OLAF CHITIL, Lehrstuhl fuer Informatik II, RWTH Aachen, 52056 Aachen, Germany
 Tel: (+49/0)241/80-21212; Fax: (+49/0)241/-217
 URL: http://www-i2.informatik.rwth-aachen.de/~chitil/





RE: Reduction count as efficiency measure?

1998-11-26 Thread Simon Peyton-Jones

 I'm still curious about my first question, though, about the specific
 optimizations included in ghc and hbc.  If in fact they don't do CSE,
 are there optimizations which they do perform which would change the
 asymptotic running time?

GHC doesn't do CSE, and this is part of the reason... generally
I don't think that compilers should change asymptotic running time.
CSE can also make space consumption asymptotically worse.

Having said that, full laziness (which GHC does do) can change
asymptotic running time.  Consider

f n = let g m = h n + m
      in sum (map g [1..n])

where h is linear in n.  The running time is n**2, but if you
lift out the invariant (h n) from the inside of g, you get back
to linear.
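
Roughly, the transformed program looks like this (a sketch of the effect of
full laziness, not GHC's actual output; `hn' is just a name chosen here for
the floated binding):

f n = let hn = h n
          g m = hn + m
      in sum (map g [1..n])

Now (h n) is evaluated once instead of once per list element, so the whole
thing is linear if h is.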

Simon





Re: Reduction count as efficiency measure?

1998-11-25 Thread Carl R. Witty

Keith Wansbrough [EMAIL PROTECTED] writes:

 So while Hugs gives you a reduction count (or even a millisecond
 duration), this is essentially meaningless: in a real application
 you would compile the code with an optimising compiler.  The effect
 this can have on your execution time can easily be more than merely
 a constant factor: it can change the order of your algorithm.

Is this true in practice?  That is, are there programs which have
different asymptotic running times when compiled under ghc or hbc than
when running under Hugs?

It would actually surprise me if there were; I'm having a hard time
imagining a realistic optimization that would do this.  (Which could
easily be a failure of my imagination.)

Carl Witty
[EMAIL PROTECTED]





Re: Reduction count as efficiency measure?

1998-11-25 Thread Fergus Henderson

On 24-Nov-1998, Jan Skibinski [EMAIL PROTECTED] wrote:
 
 On Tue, 24 Nov 1998, Lennart Augustsson wrote:
 
  I'm not sure what you're trying to accomplish.  If you want to
  decide from a theoretical point of view which is the better
  algorithm then you should do something more abstract.
  If you're trying to decide which algorithm is better in a
  certain situation then you should time it in that context.
  Comparing reduction counts in Hugs will not help you much
  if you ultimately want to run your code compiled with e.g. hbc.
  Something which is better with one compiler can be worse
  on another.
 
   I am simply trying to choose the best tool for a certain
   class of algorithms.
   Given a choice of Array, List and one of the varieties
   of Random Access List; and assuming that I am not doing
   anything really stupid implementation-wise, such as using
   indexing for Lists; the practical question is:
   which of those data structures would give me the best
   response time?

I suspect that array performance in Haskell is quite likely to be
strongly dependent on compiler optimizations.  Using Hugs for
performance comparisons may (or may not) give you reasonable results
for comparing List and Random Access List, but for comparisons with
Array I'd advise you use something that is close to what you will use
for the final implementation.

-- 
Fergus Henderson [EMAIL PROTECTED]  |  "Binaries may die
WWW: http://www.cs.mu.oz.au/~fjh  |   but source code lives forever"
PGP: finger [EMAIL PROTECTED]| -- leaked Microsoft memo.





Re: Reduction count as efficiency measure?

1998-11-25 Thread Ralf Hinze

| Is this true in practice?  That is, are there programs which have
| different asymptotic running times when compiled under ghc or hbc than
| when running under Hugs?
| 
| It would actually surprise me if there were; I'm having a hard time
| imagining a realistic optimization that would do this.  (Which could
| easily be a failure of my imagination.)

What about common subexpression elimination? AFAIK neither Hugs nor GHC
nor HBC implement CSE, but if they did the asymptotic running time
could sometimes radically change. Here is a nice example several
first-year students came up with when I asked them to program `maximum'.

 maximum [a]   =  a
 maximum (a : as)  =  if a < maximum as then maximum as else a

Implemented this way, `maximum' has exponential running time. Eliminating
the duplicated call `maximum as' yields a linear implementation.

 maximum [a]   =  a
 maximum (a : as)  =  let m = maximum as in if a < m then m else a

Does this example convince you?

Cheers, Ralf





Re: Reduction count as efficiency measure?

1998-11-25 Thread Carl R. Witty

Ralf Hinze [EMAIL PROTECTED] writes:

 | Is this true in practice?  That is, are there programs which have
 | different asymptotic running times when compiled under ghc or hbc than
 | when running under Hugs?
 | 
 | It would actually surprise me if there were; I'm having a hard time
 | imagining a realistic optimization that would do this.  (Which could
 | easily be a failure of my imagination.)
 
 What about common subexpression elimination? AFAIK neither Hugs nor GHC
 nor HBC implement CSE, but if they did the asymptotic running time
 could sometimes radically change. 

OK, so my imagination is severely lacking. :-)

I'm still curious about my first question, though, about the specific
optimizations included in ghc and hbc.  If in fact they don't do CSE,
are there optimizations which they do perform which would change the
asymptotic running time?

Carl Witty
[EMAIL PROTECTED]





RE: Reduction count as efficiency measure?

1998-11-24 Thread Hans Aberg

At 05:12 -0600 1998/11/24, Jan Skibinski wrote:
 To paraphrase George Orwell, "All reductions are equal, but some are
 more equal than others."
...
So are assembly language instructions. Yet, I could think about
some average instruction size for similarly written programs.
Naive as my question might have been, I asked it anyway
in the hope of receiving some insight into how to compare similar algorithms
in Hugs.
...
If neither the reduction count nor the timing is an appropriate
measure of efficiency in Hugs, then what is? Is there any
profiling tool available for the interpreter?

Since modern CPUs are designed to make the more commonly used assembly
instructions faster, the only way to find out the speed of the components
of a program is to use a profiler.

  Hans Aberg
  * Email: Hans Aberg mailto:[EMAIL PROTECTED]
  * Home Page: http://www.matematik.su.se/~haberg/
  * AMS member listing: http://www.ams.org/cml/






RE: Reduction count as efficiency measure?

1998-11-24 Thread Jan Skibinski



 If neither the reduction count nor the timing is an appropriate
 measure of efficiency in Hugs, then what is? Is there any
 profiling tool available for the interpreter?
 
 Since modern CPUs are designed to make the more commonly used assembly
 instructions faster, the only way to find out the speed of the components
 of a program is to use a profiler.
 
Looks like you missed my last question:
"Is there any profiling tool available for the interpreter?"
I meant: "for Hugs" but if HBI has some I would gladly
use it.

Jan 





RE: Reduction count as efficiency measure?

1998-11-24 Thread Hans Aberg

At 09:49 -0600 1998/11/24, Jan Skibinski wrote:
 Since modern CPUs are designed to make the more commonly used assembly
 instructions faster, the only way to find out the speed of the components
 of a program is to use a profiler.

   Looks like you missed my last question:
   "Is there any profiling tool available for the interpreter?"
   I meant: "for Hugs" but if HBI has some I would gladly
   use it.

This is what I meant, too: A profiler for Hugs that allows one to check the
time of the Haskell program components it interprets. What did you think I
meant?

  Hans Aberg
  * Email: Hans Aberg mailto:[EMAIL PROTECTED]
  * Home Page: http://www.matematik.su.se/~haberg/
  * AMS member listing: http://www.ams.org/cml/






Re: Reduction count as efficiency measure?

1998-11-24 Thread Graeme Moss


...(continued)...

Sorry, I forgot to include my references:

[1]

@InProceedings{oka95b,
  author =   "Chris Okasaki",
  title ="Purely functional random-access lists",
  pages ="86--95",
  booktitle ="Conference Record of FPCA '95",
  year = 1995,
  publisher ="ACM Press",
  month =jun
}

An early paper on Auburn based on queues:

[2]

@InProceedings{mos97,
  author =   {Graeme E. Moss and Colin Runciman},
  title ={Auburn: A Kit for Benchmarking Functional Data Structures},
  booktitle ={Proceedings of IFL'97},
  volume =   1467,
  series =   {LNCS},
  month =sep,
  pages ={141--160},
  year = 1997
}

A recent paper on Auburn based on random-access lists:

[3]

@InProceedings{mos99a,
  author =   {Graeme E. Moss and Colin Runciman},
  title ={Automated Benchmarking of Functional Data Structures},
  booktitle ={Proceedings of PADL'99},
  series =   {LNCS},
  year = 1999,
  note = {To be published.}
}

Neither [2] nor [3] includes a reference to a decision tree inducer,
though [3] contains a hand-made decision tree, based on the experiments
reported there, that is very similar to the tree induced recently.

Graeme.





RE: Reduction count as efficiency measure?

1998-11-24 Thread Jan Skibinski


Hi Mark,

 To paraphrase George Orwell, "All reductions are equal, but some are
 more equal than others."

:-)

So are assembly language instructions. Yet, I could think about
some average instruction size for similarly written programs.
Naive as my question might have been, I asked it anyway
in the hope of receiving some insight into how to compare similar algorithms
in Hugs.

Well, I realize that this can easily be done by comparing the
execution times of compiled programs. Yet when I think of
incremental development of algorithms, nothing beats the interpreter.
If only there were a way to make such comparisons properly!

You have quoted from the Hugs manual:
 
 One reasonable use of the statistics produced by +s would be
 to observe general trends in the behaviour of a single algorithm
 with variations in its input.

Sure, this makes sense to me. For example, the number of reductions
during a lookup of the k-th list element, as in [0..] !! k, can be
approximated by
14*k + 14
So there is some trend indeed. And a linear one, too!
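
(The counts come from Hugs's +s statistics, obtained roughly like this; with
+s switched on, Hugs prints the number of reductions and cells after each
result:)

Prelude> :set +s
Prelude> [0..] !! 100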

Now, when I replace the standard list with one of Chris Okasaki's
random access lists, say BinaryList, there is also
a trend:

number of reductions = alpha * log k + initialization,
where
initialization = 14 * r + 45
and r is the size of the list (a minimum list size of 10 is needed
for this to be more or less true).

Contrary to what Chris is suggesting, this random access list
does not look so efficient, because of the initialization needed:
the list must first be entirely defined before a lookup can
be attempted, while the standard list may be lazily defined
as an infinite structure. Of course, if a function that uses the
random access list does a lot of lookups, then the savings will be
apparent. But for casual usage the standard lists seem to be more
efficient.

The question remains: does the Orwellian law apply here? Am I
comparing apples to oranges? Maybe, but what choice do I have
in Hugs? Should I use the timer functionality? But there is
a note in "timer.c" that discourages that too (the note is appended
here). I wonder whether the strong wording of that note
has a real justification or is just a formal disclaimer.

Yet when I take the two measures - the reduction count and
the timing - both show the same trend.
A better example would be a comparison of several implementations
of the scalar product of two vectors with Double components,
since it can be considered a primitive for other numerical
algorithms. The data below is for vectors of size 100.

   Container    Uses indexing?   Reductions   Time (milliseconds)
   ---------------------------------------------------------------
1. Array        Yes                   8,064   120
2. List         Yes                 102,930   830 (GC occurred)
3. List         No                    1,421    80
4. BinaryList   Yes                  20,641   230
5. BinaryList   No                    5,755   120
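
Roughly, the list-based rows correspond to implementations along these lines
(only a sketch, not the exact code used for the measurements):

-- Row 2: list with indexing; (!!) is O(i) for each component, so the
-- whole product is quadratic in the vector length.
dotIndexed :: [Double] -> [Double] -> Double
dotIndexed xs ys = sum [ xs !! i * ys !! i | i <- [0 .. length xs - 1] ]

-- Row 3: list without indexing; a single linear pass over both lists.
dotZip :: [Double] -> [Double] -> Double
dotZip xs ys = sum (zipWith (*) xs ys)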
 
I somehow have a feeling that #3 is the most efficient method here.
But since this is still a guess I will rephrase my original question:

If neither the reduction count nor the timing is an appropriate
measure of efficiency in Hugs, then what is? Is there any
profiling tool available for the interpreter?

Jan

 

 
- Excerpt from timer.c in Hugs source code distribution --

"This file provides a simple mechanism for measuring elapsed time on
Unix machines (more precisely, on any machine with an rusage() function).
A somewhat limited version for other systems is also included, believed
to be ANSI compatible, but not guaranteed ...

It is included in the Hugs distribution for the purpose of benchmarking
the Hugs interpreter, comparing its performance across a variety of
different machines, and with other systems for similar languages.

To make use of these functions, use the --enable-timer when configuring
Hugs or change the setting of "WANT_TIMER" in config.h and recompile
Hugs.

It would be somewhat foolish to try to use the timings produced
in this way for anything other than the purpose described above.
In particular, using timings to compare the performance of different
versions of an algorithm is likely to give very misleading results.
The current implementation of Hugs as an interpreter, without any
significant optimizations, means that there are much more significant
overheads than can be accounted for by small variations in Hugs code."


 









Re: Reduction count as efficiency measure?

1998-11-24 Thread Lennart Augustsson


   which of those data structures would give me the best
   response time?
There is no simple answer to that question. It depends on how you
use it and what implementation you're going to use.
Set up a typical usage scenario and test it on the platform
you are going to use it on, that's the only way I know.
Trying to guess from the reduction count in Hugs can lead you wrong.

   -- Lennart





Re: Reduction count as efficiency measure?

1998-11-24 Thread Lennart Augustsson


 So are assembly language instructions. Yet, I could think about
 some average instruction size for similarly written programs.
Do you mean `time' rather than `size'?
If you do, then you can get rather wrong results when considering
assembly language since the concept of timing an individual
instruction is almost meaningless on modern machines.
(And if you do manage to compute the cycle count of an instruction
sequence then caching can play very ugly tricks on you.)
But, still, it works rather well in assembly language and you
won't be off by more than a constant factor.

In the worst case you can be very wrong with reduction count in
Hugs since there is no upper bound on the execution time of
an individual reduction.

I'm not sure what you're trying to accomplish.  If you want to
decide from a theoretical point of view which is the better
algorithm then you should do something more abstract.
If you're trying to decide which algorithm is better in a
certain situation then you should time it in that context.
Comparing reduction counts in Hugs will not help you much
if you ultimately want to run your code compiled with e.g. hbc.
Something which is better with one compiler can be worse
on another.

   -- Lennart





Re: Reduction count as efficiency measure?

1998-11-24 Thread Jan Skibinski



On Tue, 24 Nov 1998, Lennart Augustsson wrote:

 
  So are assembly language instructions. Yet, I could think about
  some average instruction size for similarly written programs.
 Do you mean `time' rather than `size'?

Sorry, I meant 'time'.
..
 I'm not sure what you're trying to accomplish.  If you want to
 decide from a theoretical point of view which is the better
 algorithm then you should do something more abstract.
 If you're trying to decide which algorithm is better in a
 certain situation then you should time it in that context.
 Comparing reduction counts in Hugs will not help you much
 if you ultimately want to run your code compiled with e.g. hbc.
 Something which is better with one compiler can be worse
 on another.
 

I am simply trying to choose the best tool for a certain
class of algorithms.
Given a choice of Array, List and one of the varieties
of Random Access List; and assuming that I am not doing
anything really stupid implementation-wise, such as using
indexing for Lists; the practical question is:
which of those data structures would give me the best
response time?

Jan

 





Re: Reduction count as efficiency measure?

1998-11-23 Thread Lennart Augustsson


   May I at least assume the implication:
   reduction_count1 < reduction_count2 ==> time1 < time2 ?
I'm not sure how the reduction count is done in Hugs,
but if it simply counts G-machine instructions then
you can't assume even this.  It's likely to be true,
but G-machine instructions can vary quite a lot in
execution time.

  Lennart





RE: Reduction count as efficiency measure?

1998-11-23 Thread Mark P Jones

Hi Jan,

|   Could anyone explain what the relation is between
|   the number of reductions and the execution time of any
|   given Haskell function? Is there any linear
|   relationship between the two on any given machine
|   and compiler/interpreter?

Absolutely, definitely not.  To paraphrase George Orwell, "All
reductions are equal, but some are more equal than others."

|   May I at least assume the implication:
|   reduction_count1 < reduction_count2 ==> time1 < time2 ?
|   when assessing efficiency of two similar algorithms
|   running in Hugs?

No, that is not a valid assumption.  This issue is actually discussed
in the Hugs manual; I'll attach an extract of that below.  The reduction
count is still a useful statistic in some situations, but not for answering
the questions that you are interested in here.

All the best,
Mark


--- From the Hugs user manual:
Note that the statistics produced by +s are an extremely crude measure
of the behaviour of a program, and can easily be misinterpreted. For
example:

  o The fact that one expression requires more reductions than another
does not necessarily mean that the first is slower; some reductions
require much more work than others, and it may be that the average
cost of reductions in the first expression is much lower than the
average for the second.

  o The cell count does not give any information about residency, which
is the number of cells that are being used at any given time. For
example, it does not distinguish between computations that run in
constant space and computations with residency proportional to the
size of the input.

One reasonable use of the statistics produced by +s would be to observe
general trends in the behaviour of a single algorithm with variations in its
input.
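
(A concrete illustration of the residency point -- this example is not part
of the manual: both expressions below allocate a number of cells proportional
to the length of the list, but only the second one holds on to that many cells
at any one time.)

last [1 .. 100000]            -- list cells are consumed as they are produced
head (reverse [1 .. 100000])  -- the whole reversed list is built before
                              -- head can deliver its result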





Reduction count as efficiency measure?

1998-11-20 Thread Jan Skibinski


Could anyone explain what the relation is between
the number of reductions and the execution time of any
given Haskell function? Is there any linear
relationship between the two on any given machine
and compiler/interpreter?

May I at least assume the implication:
reduction_count1 < reduction_count2 ==> time1 < time2 ?
when assessing efficiency of two similar algorithms
running in Hugs?

Jan