[Haskell-cafe] Re: Haskell performance (again)!

2006-10-10 Thread Jón Fairbairn
Brian Hulley [EMAIL PROTECTED] writes:

 Lennart Augustsson wrote:
  I think your first try looks good.
 [snip]
  ...
  addPoly1 p1@(p1h@(Nom p1c p1d):p1t) p2@(p2h@(Nom p2c p2d):p2t)
 | p1d == p2d = Nom (p1c + p2c) p1d : addPoly1 p1t p2t
 | p1d < p2d = p1h : addPoly1 p1t p2
 | p1d > p2d = p2h : addPoly1 p1 p2t
  ...
 
 The last comparison is redundant (this was in the original
 version too) because p1d > p2d is implied (certainly for
 this case where p1d, p2d::Int) by the fall through from not
 satisfying == and < so how about:
 
 addPoly1 p1@(p1h@(Nom p1c p1d):p1t) p2@(p2h@(Nom p2c p2d):p2t)
 | p1d == p2d = Nom (p1c + p2c) p1d : addPoly1 p1t p2t
 | p1d < p2d = p1h : addPoly1 p1t p2
 | otherwise = p2h : addPoly1 p1 p2t

Surely all but one of the comparisons is unnecessary? If you
use `compare` instead of (==) and friends, won't one do (I'm
assuming that the compiler can convert cases on LT, EQ and
GT into something sensible -- after all, wasn't that the
purpose of compare?)?
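
(For concreteness, a sketch of the single-`compare` version being
suggested here; this is a reconstruction, not code from the original
posts:)

 addPoly1 p1@(p1h@(Nom p1c p1d):p1t) p2@(p2h@(Nom p2c p2d):p2t) =
     case compare p1d p2d of
         EQ -> Nom (p1c + p2c) p1d : addPoly1 p1t p2t
         LT -> p1h : addPoly1 p1t p2
         GT -> p2h : addPoly1 p1 p2t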

-- 
Jón Fairbairn [EMAIL PROTECTED]
http://www.chaos.org.uk/~jf/Stuff-I-dont-want.html  (updated 2006-09-13)



Re: [Haskell-cafe] Re: Haskell performance (again)!

2006-10-10 Thread Lennart Augustsson


On Oct 10, 2006, at 14:58 , Jón Fairbairn wrote:


Bulat Ziganshin [EMAIL PROTECTED] writes:


Hello Jon,

Tuesday, October 10, 2006, 1:18:52 PM, you wrote:


Surely all but one of the comparisons is unnecessary? If you
use `compare` instead of (==) and friends, won't one do (I'm
assuming that the compiler can convert cases on LT, EQ and
GT into something sensible -- after all, wasn't that the
purpose of compare?)?


it would be too smart for GHC. The actual code is:

compareInt# :: Int# -> Int# -> Ordering
compareInt# x# y#
    | x# <#  y# = LT
    | x# ==# y# = EQ
    | otherwise = GT


But once that's been inlined and through whatever code
generator, what then? If it doesn't get turned into one test
on the data and conditional jumps on sign bits, something
isn't doing a thorough job...


Assuming your machine architecture supports something like condition
codes.  On, e.g., the MIPS you would need to test for < and ==
separately.


-- Lennart



[Haskell-cafe] Re: Haskell performance (again)!

2006-10-10 Thread Stefan Monnier
 Assuming your machine architecture supports something like condition codes.
 On, e.g., the MIPS you would need to test for < and == separately.

And even if your machine supports condition codes, you'll need one test plus
two conditional jumps.  Not much better than MIPS's 2 independent tests plus
2 conditional jumps.


Stefan



Re: [Haskell-cafe] Re: Haskell performance (again)!

2006-10-09 Thread Cale Gibbard

On 09/10/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

Cale Gibbard wrote:
 I might also like to point out that by small and large, we're
 actually referring to the number of ways in which components of the
 datastructure can be computed separately, which tends to correspond
 nicely to how one usually pictures small and large pieces of data.

I'm not sure what you mean. I thought it is appropriate to count the
number of approximations to a given value v, i.e. count how many v'
there are with
v' <= v
in the usual CPO ordering, that is Cons _|_ [1,2] <= Cons 3 [1,2] etc.
That count is either large or small.


Yeah, that's exactly what I'm talking about. :)
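
(A quick GHCi illustration of the approximation idea -- an example of
mine, not from the thread: a consumer that only demands part of a value
is satisfied by an approximation of it, such as Cons _|_ [1,2].)

 ghci> length (undefined : [1,2])   -- only the list spine is demanded
 3
 ghci> head (undefined : [1,2])     -- now the missing head is demanded
 *** Exception: Prelude.undefined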


[Haskell-cafe] Re: Haskell performance (again)!

2006-10-08 Thread apfelmus
Hello,

admittedly, there is a lack of material on lazy evaluation and
performance. IMHO, the current wiki(book) and other articles are
somewhat inadequate, which stems from the fact that the current rumor is
"strictness is fast" and "arrays must be unboxed" or so. I don't agree
with this, so I post some remarks about performance in general and about
laziness in particular.


My first remark about performance is unfair and amounts to discussing the
issue away:
* The programmer's viewpoint about performance in Haskell should be a
lazy one, i.e. you think about the performance of your code only when
it's forced by someone else. If no one complains, do not even waste a
second thinking about it.

So

 type Poly = [(Int,Int)]

 addPoly1 :: Poly -> Poly -> Poly
 addPoly1 p1@(p1h@(p1c,p1d):p1t) p2@(p2h@(p2c,p2d):p2t)
    | p1d == p2d = (p1c + p2c, p1d) : addPoly1 p1t p2t
    | p1d <  p2d = p1h : addPoly1 p1t p2
    | p1d >  p2d = p2h : addPoly1 p1 p2t
 addPoly1 p1 [] = p1
 addPoly1 [] p2 = p2
 addPoly1 [] [] = []

is the right thing to do. Period. With that viewpoint, your sorrows will
fade away :) Mine do.
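
(For concreteness, a quick GHCi check of addPoly1 above -- my example,
assuming (coefficient, degree) pairs kept in ascending degree order,
which is what the guards imply:)

 ghci> addPoly1 [(1,0),(2,2)] [(3,1)]   -- (1 + 2x^2) + 3x
 [(1,0),(3,1),(2,2)]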

In my experience, clean code by far outweighs any performance issues, as
it allows one to reduce the complexity of the programming task until it
actually becomes tractable. As an example, Darcs' patch mechanism would be
almost impossible to code in a non-functional language. And what about
those open source / freeware game web sites where one reads at some point
"my code base became absolutely unmaintainable and I had to abandon this
project"?

Linked to that is the advent of scripting languages like Perl, Python and
Tcl. Why do people use these interpreted languages even though they are
the coffins of performance? IMHO, there is also already a trend to buy
medium abstraction (Java, Objective-C) by deliberately selling away
performance ("Java programs are so lame", "Objective-C can be sped up 10%
by optimizing the msg_send() assembly"), and I think Haskell has a much
better benefit-to-cost ratio than the others.


The second remark is:
* At the machine level, laziness is an overhead, but it's only a constant
factor. *Constant factor*!

There is a story about this in S. Skiena's "The Algorithm Design Manual"
where some guy eats up all the CPU time of a supercomputer for a simple
task because he programmed it using an asymptotically mediocre algorithm.
Ouch.

This already applies to your polynomials: are you sure [(Int,Int)] is
the right data structure to fulfill your performance requirements on the
asymptotic running time of your intended operations? Or should you use
  Data.Map Int Int
which grants O(log n) random access (by index i) to every coefficient
a_i, as opposed to the list case where you can only fetch a_i in O(n) time?
The answer is that your original representation is quite suitable, as the
common operations +, *, diff, evaluation etc. for polynomials have no a
priori knowledge about which coefficients are 0 and which are not; they
have to discover it anyway. Only internally does * need a finite map to
collect common degrees. You can even write a function sum ::
[Polynomial] -> Polynomial which uses a finite map to collect degrees
and define + and * in terms of it.
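
(A rough sketch of such a sum, assuming the (coefficient, degree)
representation from above; the names here are mine, not from any post:)

 import qualified Data.Map as Map

 -- Collect coefficients of equal degree in a finite map, then
 -- rebuild the ascending-degree list.
 sumPolys :: [Poly] -> Poly
 sumPolys ps = [ (c, d) | (d, c) <- Map.toAscList m ]
   where
     m = Map.fromListWith (+) [ (d, c) | p <- ps, (c, d) <- p ]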

By the way, I don't know any imperative hobby programmer for whom
mutable arrays are not the one-size-fits-all collection data structure.
Haskell offers much easier access to data structures like binary
search trees with appropriate polymorphism. IMHO, Haskell is even the
first language to have a binary search tree library which offers
satisfying generality and ease of use. This makes your programs faster!


Only the third remark starts to be about performance in Haskell itself:
* Haskell is a lazy language. Why make this your enemy, why swim
against the current by making your programs more strict?

As I remember, Cale once said on IRC how the relative strengths of
strictness and laziness play out for functions which operate on data of
different sizes:
small -> small    doesn't really matter
                  (minor lazy overhead, but we have
                   strictness analysis)
large -> small    every part of the large thing is needed in the calculation:
                  strictness wins
large -> small    only parts are needed:
                  laziness wins
large -> large    depends on the large input like above,
                  but roughly the same otherwise
small -> large    roughly the same
Only in the case where strictness wins do you have to insert a seq or two.
As a rule of thumb, all recursive data structures like lists and trees
are large; Ints are small. The point is that when you are given an Int,
the first look at it will reveal all information about it. In the case of
a list, a first look only reveals the first element.
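
(As a hypothetical illustration of the "strictness wins" case -- my
example, not from the thread: summing a large list needs every element,
so forcing the accumulator with seq avoids building a long chain of
thunks.)

 sumStrict :: [Int] -> Int
 sumStrict = go 0
   where
     go acc []     = acc
     go acc (x:xs) = let acc' = acc + x
                     in acc' `seq` go acc' xs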

What was the primary reason for laziness, anyway? The famous paper "Why
Functional Programming Matters" by John Hughes gives the answer: you can
code programs in a more modular way (and this without severe performance
penalties). Laziness is what makes compositions like large -> large ->
small (large