Re: Reduction count as efficiency measure?
Bart Demoen wrote: Simon wrote: CSE can also make space consumption asymptotically worse. I can understand this when garbage collection is around and, because of CSE, some data structures don't get cleaned up; but is it also true without gc? If you don't use a rule like

  e'[e,e]  -->  let x = e in e'[x,x]

for CSE, but limit yourself to rules like

  let x = e in e'[e]  -->  let x = e in e'[x]

then you are right that the transformed program will not allocate more closures than the old one (in fact fewer in most cases). However, the liveness of heap cells will (often) increase and hence they cannot be collected by garbage collection as early as in the original program. Just the elimination of a few common subexpressions can increase the heap residency of a program considerably. If you want to know more about why I think that CSE is unfortunately not suitable for lazy functional languages, have a look at my paper `Common Subexpressions are Uncommon in Lazy Functional Languages' at http://www-i2.informatik.rwth-aachen.de/~chitil/PUBLICATIONS/comSubsAreUncom.html Olaf -- OLAF CHITIL, Lehrstuhl fuer Informatik II, RWTH Aachen, 52056 Aachen, Germany Tel: (+49/0)241/80-21212; Fax: (+49/0)241/-217 URL: http://www-i2.informatik.rwth-aachen.de/~chitil/
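A small sketch of the residency effect (an illustrative example, not taken from the paper): without sharing, each copy of the list can be collected as it is consumed, but once the two occurrences are commoned up the whole list stays live until the second traversal has finished.

  -- Both occurrences of [1..n] are consumed lazily and collected as
  -- they go, so this runs in constant space.
  mean1 :: Int -> Double
  mean1 n = fromIntegral (sum [1..n]) / fromIntegral (length [1..n])

  -- After CSE the list is shared: while 'sum' traverses it, the cells
  -- must be retained for the later call to 'length', so the residency
  -- grows proportionally to n.
  mean2 :: Int -> Double
  mean2 n = let xs = [1..n]
            in  fromIntegral (sum xs) / fromIntegral (length xs)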
RE: Reduction count as efficiency measure?
I'm still curious about my first question, though, about the specific optimizations included in ghc and hbc. If in fact they don't do CSE, are there optimizations which they do perform which would change the asymptotic running time? GHC doesn't do CSE, and this is part of the reason... generally I don't think that compilers should change asymptotic running time. CSE can also make space consumption asymptotically worse. Having said that, full laziness (which GHC does do) can change asymptotic running time. Consider

  f n = let g m = h n + m in sum (map g [1..n])

where h is linear in n. The running time is n**2, but if you lift out the invariant (h n) from the inside of g, you get back to linear. Simon
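Spelt out, with a made-up linear h for concreteness (this is only an illustration of the transformation, not GHC's actual output):

  h :: Int -> Int
  h n = sum [1..n]            -- takes time proportional to n

  -- Original definition: 'h n' sits under the lambda for g, so it is
  -- recomputed for every element of [1..n], giving quadratic time.
  f :: Int -> Int
  f n = let g m = h n + m in sum (map g [1..n])

  -- After floating the invariant 'h n' out of g it is computed only
  -- once, and the overall running time is linear again.
  f' :: Int -> Int
  f' n = let hn = h n
             g m = hn + m
         in sum (map g [1..n])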
Re: Reduction count as efficiency measure?
Keith Wansbrough [EMAIL PROTECTED] writes: So while Hugs gives you a reduction count (or even a millisecond duration), this is essentially meaningless: in a real application you would compile the code with an optimising compiler. The effect this can have on your execution time can easily be more than merely a constant factor: it can change the order of your algorithm. Is this true in practice? That is, are there programs which have different asymptotic running times when compiled under ghc or hbc than when running under Hugs? It would actually surprise me if there were; I'm having a hard time imagining a realistic optimization that would do this. (Which could easily be a failure of my imagination.) Carl Witty [EMAIL PROTECTED]
Re: Reduction count as efficiency measure?
On 24-Nov-1998, Jan Skibinski [EMAIL PROTECTED] wrote: On Tue, 24 Nov 1998, Lennart Augustsson wrote: I'm not sure what you're trying to accomplish. If you want to decide from a theoretical point of view which is the better algorithm then you should do something more abstract. If you're trying to decide which algorithm is better in a certain situation then you should time it in that context. Comparing reduction counts in Hugs will not help you much if you ultimately want to run your code compiled with e.g. hbc. Something which is better with one compiler can be worse on another. I am simply trying to choose the best tool for a certain class of algorithms. Given a choice of Array, List and one of the varieties of Random Access List, and assuming that I am not doing anything really stupid implementation-wise, such as using indexing for Lists, the practical question is: which of those data structures would give me the best response time? I suspect that array performance in Haskell is quite likely to be strongly dependent on compiler optimizations. Using Hugs for performance comparisons may (or may not) give you reasonable results for comparing List and Random Access List, but for comparisons with Array I'd advise you to use something that is close to what you will use for the final implementation. -- Fergus Henderson [EMAIL PROTECTED] | "Binaries may die WWW: http://www.cs.mu.oz.au/~fjh | but source code lives forever" PGP: finger [EMAIL PROTECTED]| -- leaked Microsoft memo.
Re: Reduction count as efficiency measure?
| Is this true in practice? That is, are there programs which have | different asymptotic running times when compiled under ghc or hbc than | when running under Hugs? | | It would actually surprise me if there were; I'm having a hard time | imagining a realistic optimization that would do this. (Which could | easily be a failure of my imagination.) What about common subexpression elimination? AFAIK neither Hugs nor GHC nor HBC implements CSE, but if they did, the asymptotic running time could sometimes change radically. Here is a nice example several first-year students came up with when I asked them to program `maximum'.

  maximum [a]      = a
  maximum (a : as) = if a < maximum as then maximum as else a

Thus implemented, `maximum' has an exponential running time. Eliminating the double call `maximum as' yields a linear implementation.

  maximum [a]      = a
  maximum (a : as) = let m = maximum as in if a < m then m else a

Does this example convince you? Cheers, Ralf
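P.S. A rough count, for a list of length n: in the worst case (an ascending list) both the condition and the then-branch of the first version evaluate `maximum as' afresh, so its cost satisfies

  T(n) = 2*T(n-1) + O(1),   i.e. T(n) grows like 2^n

whereas the second version evaluates `maximum as' only once per level:

  T(n) = T(n-1) + O(1),     i.e. T(n) grows linearly in n.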
Re: Reduction count as efficiency measure?
Ralf Hinze [EMAIL PROTECTED] writes: | Is this true in practice? That is, are there programs which have | different asymptotic running times when compiled under ghc or hbc than | when running under Hugs? | | It would actually surprise me if there were; I'm having a hard time | imagining a realistic optimization that would do this. (Which could | easily be a failure of my imagination.) What about common subexpression elimination? AFAIK neither Hugs nor GHC nor HBC implement CSE, but if they did the asymptotic running time could sometimes radically change. OK, so my imagination is severely lacking. :-) I'm still curious about my first question, though, about the specific optimizations included in ghc and hbc. If in fact they don't do CSE, are there optimizations which they do perform which would change the asymptotic running time? Carl Witty [EMAIL PROTECTED]
RE: Reduction count as efficiency measure?
At 05:12 -0600 1998/11/24, Jan Skibinski wrote: To paraphrase George Orwell, "All reductions are equal, but some are more equal than others." ... So are assembly language instructions. Yet, I could think about some average instruction size for similarly written programs. Naive as my question might have been, I asked it anyway in the hope of receiving some insight into how to compare similar algorithms in Hugs. ... If neither the reduction count nor the timing are appropriate measures of efficiency in Hugs, then what is? Is there any profiling tool available for the interpreter? Since modern CPUs are developed so as to make more commonly used assembler instructions faster, the only way to find out the speed of the components of a program is to use a profiler. Hans Aberg * Email: Hans Aberg mailto:[EMAIL PROTECTED] * Home Page: http://www.matematik.su.se/~haberg/ * AMS member listing: http://www.ams.org/cml/
RE: Reduction count as efficiency measure?
If neither the reduction count nor the timing are appropriate measures of efficiency in Hugs, then what is? Is there any profiling tool available for the interpreter? Since modern CPUs are developed so as to make more commonly used assembler instructions faster, the only way to find out the speed of the components of a program is to use a profiler. It looks like you missed my last question: "Is there any profiling tool available for the interpreter?" I meant "for Hugs", but if HBI has one I would gladly use it. Jan
RE: Reduction count as efficiency measure?
At 09:49 -0600 1998/11/24, Jan Skibinski wrote: Since modern CPUs are developed so as to make more commonly used assembler instructions faster, the only way to find out the speed of the components of a program is to use a profiler. It looks like you missed my last question: "Is there any profiling tool available for the interpreter?" I meant "for Hugs", but if HBI has one I would gladly use it. This is what I meant, too: a profiler for Hugs that allows one to check the time of the Haskell program components it interprets. What did you think I meant? Hans Aberg * Email: Hans Aberg mailto:[EMAIL PROTECTED] * Home Page: http://www.matematik.su.se/~haberg/ * AMS member listing: http://www.ams.org/cml/
Re: Reduction count as efficiency measure?
...(continued)... Sorry, I forgot to include my references:

[1] @InProceedings{oka95b,
      author    = "Chris Okasaki",
      title     = "Purely functional random-access lists",
      pages     = "86--95",
      booktitle = "Conference Record of FPCA '95",
      year      = 1995,
      publisher = "ACM Press",
      month     = jun
    }

An early paper on Auburn based on queues:

[2] @InProceedings{mos97,
      author    = {Graeme E. Moss and Colin Runciman},
      title     = {Auburn: A Kit for Benchmarking Functional Data Structures},
      booktitle = {Proceedings of IFL'97},
      volume    = 1467,
      series    = {LNCS},
      month     = sep,
      pages     = {141--160},
      year      = 1997
    }

A recent paper on Auburn based on random-access lists:

[3] @InProceedings{mos99a,
      author    = {Graeme E. Moss and Colin Runciman},
      title     = {Automated Benchmarking of Functional Data Structures},
      booktitle = {Proceedings of PADL'99},
      series    = {LNCS},
      year      = 1999,
      note      = {To be published.}
    }

Neither [2] nor [3] include references to a decision tree inducer, though [3] contains a decision tree hand-made from the experiments of [3] that is very similar to the tree induced recently. Graeme.
RE: Reduction count as efficiency measure?
Hi Mark, To paraphrase George Orwell, "All reductions are equal, but some are more equal than others." :-) So are assembly language instructions. Yet, I could think about some average instruction size for similarly written programs. Naive as my question might have been, I asked it anyway in the hope of receiving some insight into how to compare similar algorithms in Hugs. Well, I realize that this can easily be done by comparing execution times of compiled programs. Yet, when I think of incremental development of algorithms, nothing beats the interpreter. If only there were a way to make such comparisons properly! You have quoted from the Hugs manual: "One reasonable use of the statistics produced by +s would be to observe general trends in the behaviour of a single algorithm with variations in its input." Sure, this makes sense to me. For example, the number of reductions during a lookup of the k-th list element, as in [0..]!!k, can be approximated by

  14*k + 14

So there is some trend indeed. And linear too! Now, when I substitute one of the random access lists of Chris Okasaki, say BinaryList, for the standard list, there is also a trend:

  number of reductions = alpha * log k + initialization,
  where initialization = 14*r + 45

and r is the size of the list (a minimum list size of 10 is needed for this to be more or less true). Contrary to what Chris is suggesting, this random access list does not look so efficient, because of the initialization needed: this list must first be entirely defined before a lookup can be attempted, while the standard list might be lazily defined as an infinite structure. Of course, if a function that uses the random access list does a lot of lookups then the savings will be apparent. But for casual usage the standard lists seem to be more efficient. The question remains: is the Orwellian law true here? Am I comparing apples to oranges? Maybe, but what choice do I have in Hugs? Should I use the timer functionality? But there is a note in "timer.c" that discourages it too. (The note is appended here.) I wonder whether the strong wording of that note has a real justification or is it just a formal disclaimer? Yet, when I take two such measures, the reduction count and the timing, both show the same trend. A better example would be a comparison of several implementations of the scalar product of two vectors with Double components, since it can be considered a primitive for other numerical algorithms. The data is provided for vectors of size 100. (Rough sketches of the kinds of implementations compared appear after the timer.c excerpt below.)

     Container    Uses indexing?   Reductions   Time (milliseconds)
  ------------------------------------------------------------------
  1. Array        Yes                   8,064   120
  2. List         Yes                 102,930   830  (GC occurred)
  3. List         No                    1,421    80
  4. BinaryList   Yes                  20,641   230
  5. BinaryList   No                    5,755   120

I somehow have a feeling that #3 is the most efficient method here. But since this is still a guess I will rephrase my original question: If neither the reduction count nor the timing are appropriate measures of efficiency in Hugs, then what is? Is there any profiling tool available for the interpreter? Jan - Excerpt from timer.c in the Hugs source code distribution -- "This file provides a simple mechanism for measuring elapsed time on Unix machines (more precisely, on any machine with an rusage() function). A somewhat limited version for other systems is also included, believed to be ANSI compatible, but not guaranteed ... It is included in the Hugs distribution for the purpose of benchmarking the Hugs interpreter, comparing its performance across a variety of different machines, and with other systems for similar languages.
To make use of these functions, use the --enable-timer when configuring Hugs or change the setting of "WANT_TIMER" in config.h and recompile Hugs. It would be somewhat foolish to try to use the timings produced in this way for anything other than the purpose described above. In particular, using timings to compare the performance of different versions of an algorithm is likely to give very misleading results. The current implementation of Hugs as an interpreter, without any significant optimizations, means that there are much more significant overheads than can be accounted for by small variations in Hugs code."
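For concreteness, scalar products in the styles of rows 1-3 of the table above might look something like this (only sketches, the details of the measured code may differ; the BinaryList rows use Chris Okasaki's random-access lists and are not shown):

  import Array

  -- 1. Array with indexing (assuming both arrays have the same bounds)
  dotArray :: Array Int Double -> Array Int Double -> Double
  dotArray a b = sum [ a!i * b!i | i <- range (bounds a) ]

  -- 2. List with indexing; the repeated (!!) lookups make this quadratic
  dotListIx :: [Double] -> [Double] -> Double
  dotListIx xs ys = sum [ (xs!!i) * (ys!!i) | i <- [0 .. length xs - 1] ]

  -- 3. List without indexing
  dotList :: [Double] -> [Double] -> Double
  dotList xs ys = sum (zipWith (*) xs ys)

  -- e.g.  dotList [1..100] [1..100]
  --       dotArray (listArray (1,100) [1..100]) (listArray (1,100) [1..100])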
Re: Reduction count as efficiency measure?
which of those data structures would give me the best response time? There is no simple answer to that question. It depends on how you use it and what implementation you're going to use. Set up a typical usage scenario and test it on the platform you are going to use it on, that's the only way I know. Trying to guess from the reduction count in Hugs can lead you wrong. -- Lennart
Re: Reduction count as efficiency measure?
So are assembly language instructions. Yet, I could think about some average instruction size for similarly written programs. Do you mean `time' rather than `size'? If you do, then you can get rather wrong results when considering assembly language since the concept of timing an individual instruction is almost meaningless on modern machines. (And if you do manage to compute the cycle count of an instruction sequence then caching can play very ugly tricks on you.) But, still, it works rather well in assembly language and you won't be off by more than a constant factor. In the worst case you can be very wrong with reduction count in Hugs since there is no upper bound on the execution time of an individual reduction. I'm not sure what you're trying to accomplish. If you want to decide from a theoretical point of view which is the better algorithm then you should do something more abstract. If you're trying to decide which algorithm is better in a certain situation then you should time it in that context. Comparing reduction counts in Hugs will not help you much if you ultimately want to run your code compiled with e.g. hbc. Something which is better with one compiler can be worse on another. -- Lennart
Re: Reduction count as efficiency measure?
On Tue, 24 Nov 1998, Lennart Augustsson wrote: So are assembly language instructions. Yet, I could think about some average instruction size for similarly written programs. Do you mean `time' rather than `size'? Sorry, I meant 'time'. .. I'm not sure what you're trying to accomplish. If you want to decide from a theoretical point of view which is the better algorithm then you should do something more abstract. If you're trying to decide which algorithm is better in a certain situation then you should time it in that context. Comparing reduction counts in Hugs will not help you much if you ultimately want to run your code compiled with e.g. hbc. Something which is better with one compiler can be worse on another. I am simply trying to choose the best tool for a certain class of algorithms. Given a choice of Array, List and one of the varieties of Random Access List, and assuming that I am not doing anything really stupid implementation-wise, such as using indexing for Lists, the practical question is: which of those data structures would give me the best response time? Jan
Re: Reduction count as efficiency measure?
May I at least assume the implication: reduction_count1 < reduction_count2 ==> time1 < time2 ? I'm not sure how the reduction count is done in Hugs, but if it simply counts G-machine instructions then you can't assume even this. It's likely to be true, but G-machine instructions can vary quite a lot in execution time. Lennart
RE: Reduction count as efficiency measure?
Hi Jan, | Could anyone explain what the relation is between | the number of reductions and the execution time of | a given Haskell function? Is there any linear | relationship between the two on any given machine | and compiler/interpreter? Absolutely, definitely not. To paraphrase George Orwell, "All reductions are equal, but some are more equal than others." | May I at least assume the implication: | reduction_count1 < reduction_count2 ==> time1 < time2 ? | when assessing the efficiency of two similar algorithms | running in Hugs? No, that is not a valid assumption. This issue is actually discussed in the Hugs manual; I'll attach an extract of that below. The reduction count is still a useful statistic in some situations, but not for answering the questions that you are interested in here. All the best, Mark --- From the Hugs user manual: Note that the statistics produced by +s are an extremely crude measure of the behaviour of a program, and can easily be misinterpreted. For example:

  o The fact that one expression requires more reductions than another does not necessarily mean that the first is slower; some reductions require much more work than others, and it may be that the average cost of reductions in the first expression is much lower than the average for the second.

  o The cell count does not give any information about residency, which is the number of cells that are being used at any given time. For example, it does not distinguish between computations that run in constant space and computations with residency proportional to the size of the input.

One reasonable use of the statistics produced by +s would be to observe general trends in the behaviour of a single algorithm with variations in its input.
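For reference, the +s statistics mentioned above are switched on from the interpreter prompt with `:set +s' (or by starting Hugs with the +s option); Hugs then appends a summary line after each evaluation, roughly of the form shown below (the figures here are invented, only the format matters):

  Prelude> :set +s
  Prelude> [0..] !! 100
  100
  (1414 reductions, 2048 cells)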
Reduction count as efficiency measure?
Could anyone explain what the relation is between the number of reductions and the execution time of a given Haskell function? Is there any linear relationship between the two on any given machine and compiler/interpreter? May I at least assume the implication: reduction_count1 < reduction_count2 ==> time1 < time2 ? when assessing the efficiency of two similar algorithms running in Hugs? Jan