Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
I see. I read some chapters of Purely Functional Data Structures when I was in college in order to understand some tree algorithms, but not the whole book. Do you think it could help me understand performance problems with code (poorly) written in Haskell? From reading your post, I can guess the book is a good start, but what about Haskell/GHC-specific hacks? Is there a good written source for gathering the required knowledge? For example, I remember I struggled a lot with Arrays once, and only days after digging in did I find that Arrays are a no-no in Haskell. That was before finding this mailing list, though.

On Dec 1, 2012 9:18 PM, wren ng thornton w...@freegeek.org wrote:

> On 11/29/12 2:17 PM, Ivan Salazar wrote:
>> The bad side is that direct translations of algorithms are almost always very slow, and the work needed to make them perform is very mind-bending.
>
> Indeed. The thing is, all algorithms make (implicit) assumptions about the cost model of the underlying language. The problem comes about because the assumptions made in most algorithms are not valid in Haskell. This isn't just an issue of laziness; the entire computational model of Haskell (e.g., STG) is radically different from imperative languages.
>
> For example, in many cases just passing arguments around (a la the State monad) is much faster than using ST and explicit mutation. GHC does a lot of clever code reorganization, but mutation breaks countless purity laws and so it inhibits the application of these optimizations. Unfortunately, the vast majority of algorithms out there assume you're working in a language where mutation is essentially free. I'm not talking about any significant theoretical issues about using mutation or not; I'm just talking about the implicit assumptions that algorithm implementers make. If you believe mutation is essentially free, then it makes sense to create an initial object and then incrementally mutate it this way and that until you get to where you want.
> But, a great deal of the time there's nothing actually stopping us from gathering all the information and allocating the final result directly. However, if you're not used to thinking in those terms, then the conceptual reorganization required to see how to gather all the data at once is indeed mind-bending.
>
> -- Live well,
> ~wren

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
On 11/29/12 2:17 PM, Ivan Salazar wrote:
> The bad side is that direct translations of algorithms are almost always very slow, and the work needed to make them perform is very mind-bending.

Indeed. The thing is, all algorithms make (implicit) assumptions about the cost model of the underlying language. The problem comes about because the assumptions made in most algorithms are not valid in Haskell. This isn't just an issue of laziness; the entire computational model of Haskell (e.g., STG) is radically different from imperative languages.

For example, in many cases just passing arguments around (a la the State monad) is much faster than using ST and explicit mutation. GHC does a lot of clever code reorganization, but mutation breaks countless purity laws and so it inhibits the application of these optimizations. Unfortunately, the vast majority of algorithms out there assume you're working in a language where mutation is essentially free. I'm not talking about any significant theoretical issues about using mutation or not; I'm just talking about the implicit assumptions that algorithm implementers make. If you believe mutation is essentially free, then it makes sense to create an initial object and then incrementally mutate it this way and that until you get to where you want. But, a great deal of the time there's nothing actually stopping us from gathering all the information and allocating the final result directly. However, if you're not used to thinking in those terms, then the conceptual reorganization required to see how to gather all the data at once is indeed mind-bending.

-- Live well,
~wren
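wren's "gather all the information and allocate the final result directly" has a concrete analogue in the array API. A minimal sketch (my illustration, not from the thread): `accumArray` from the `array` package builds the finished array from a list of contributions in one pass, where the imperative habit would be to allocate zeros and mutate slot by slot.

```haskell
import Data.Array.Unboxed (UArray, accumArray, elems, listArray)

-- Direct construction: describe the finished array, allocate it once.
squares :: Int -> UArray Int Int
squares n = listArray (0, n - 1) [ i * i | i <- [0 .. n - 1] ]

-- accumArray gathers every (index, contribution) pair up front and
-- builds the final array in a single pass: no visible mutation,
-- one allocation, where a C version would zero an array and loop.
histogram :: (Int, Int) -> [Int] -> UArray Int Int
histogram bnds xs = accumArray (+) 0 bnds [ (x, 1) | x <- xs ]

main :: IO ()
main = do
  print (elems (squares 5))
  print (elems (histogram (0, 2) [0, 1, 1, 2, 2, 2]))
```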
[Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
Hi all haskellers,

Every now and then I get the feeling that doing my job code in Haskell would be a good idea. I have tried a couple of times, but each time I seem to run into performance problems - I do lots of heavy computing.

The problem seems to be connected to lazy evaluation, which makes my programs so slow that I really cannot show them to anyone. I have tried all the tricks in the books, like !, seq, non-lazy datatypes...

I was poking around to see if this had changed, then I ran into this forum post: http://stackoverflow.com/questions/9409634/is-indexing-of-data-vector-unboxed-mutable-mvector-really-this-slow

The last solution was a Haskell program which was in the 3x range to C, which I think is ok. This was in the days of ghc 7.0.

I then tried compiling the programs myself (ghc 7.4.1), but found that the C program was now more than 100x faster. The ghc code was compiled with both -O2 and -O3, giving only small differences on my 64-bit Linux box.

So it seems something has changed - and even small examples are still not safe when it comes to the lazy monster. It reminds me of some code I read a couple of years ago where one of the Simons actually fired off a new thread, just to make sure a variable was evaluated. A sad thing, since I am more than willing to go for Haskell if it proves to be usable.

If anyone can see what is wrong with the code (there are two Haskell versions on the page; I have tried the last and fastest one) it would also be interesting.

What is your experience, dear haskellers? To me it seems this beautiful language is useless without a better lazy/eager analyzer.

Cheers,
Felix
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
Hi there,

I'm only an amateur, so just my 2 cents: Haskell can be really fast, but reaching that speed can be all but trivial. You need to use different data types (e.g. ByteString vs. the normal String type), rely on unconventional IO (e.g. Conduit, Iteratee), and still be ready to go outside base, using packages and functions which are not in base/the Haskell Platform (e.g. mwc-random).

My 2 cents :)

A.

On 29 November 2012 18:09, Fixie Fixie fixie.fi...@rocketmail.com wrote:
> [...]
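To make the ByteString-vs-String point concrete, here is a small sketch (the function is mine, not from the thread): String is a lazy linked list of Char, one heap cell per character, while a strict ByteString is a packed byte buffer, so scanning it compiles to a tight loop over memory.

```haskell
import qualified Data.ByteString.Char8 as B

-- Counting a byte in a packed buffer. The String equivalent would
-- chase one cons cell per character instead of walking a flat buffer.
countCommas :: B.ByteString -> Int
countCommas = B.count ','

main :: IO ()
main = print (countCommas (B.pack "a,b,c,d"))
```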
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
Hi Felix,

On Thu, Nov 29, 2012 at 10:09 AM, Fixie Fixie fixie.fi...@rocketmail.com wrote:
> The problem seems to be connected to lazy loading, which makes my programs so slow that I really can not show them to anyone. I have tried all tricks in the books, like !, seq, non-lazy datatypes...

My advice usually goes like this:

1. Use standard, high-performance libraries (I've made a list of high-quality libraries at https://github.com/tibbe/haskell-docs/blob/master/libraries-directory.md).
2. Make your data type fields strict.
3. Unpack primitive types (e.g. Int, Word, Float, Double).
4. Reduce allocation in tight loops (e.g. avoid creating lots of intermediate lists).

I always do 1-3, but only do 4 when it's really necessary (e.g. in the inner loop of some machine learning algorithm). Rarely are performance issues due to missing bang patterns on functions (although there are cases, e.g. when writing recursive functions with accumulators, where you need one).

> I was poking around to see if this had changed, then I ran into this forum post: http://stackoverflow.com/questions/9409634/is-indexing-of-data-vector-unboxed-mutable-mvector-really-this-slow
> [...]
> It reminds me of some code I read a couple of years ago where one of the Simons actually fired off a new thread, to make sure a variable was realized.

Note that the issues in that post are not due to laziness (i.e. there are no space leaks), but due to the code being more polymorphic than the C code, causing extra allocation and indirection.

> A sad thing, since I am more than willing to go for Haskell if it proves to be usable. [...] To me it seems this beautiful language is useless without a better lazy/eager analyzer.

It's definitely possible to write fast Haskell code (as some Haskell programmers manage to do so consistently), but I appreciate that it's harder than it should be. In my opinion the major things missing are a good text on how to write fast Haskell code and some tweaks to the compiler (e.g. unboxing strict primitive fields like Int by default).

Hope this helps.

Cheers,
Johan
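A minimal sketch of points 2-4 above (my example, not Johan's): strict, unpacked fields let GHC store the Doubles inline in the constructor rather than as pointers to boxed, possibly-thunked values, and the bang on the accumulator is exactly the "recursive function with an accumulator" case where a bang pattern is needed.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Strict (!) plus UNPACK: the two Doubles live inline in the
-- constructor instead of behind pointers to boxed values.
data Point = Point
  { px :: {-# UNPACK #-} !Double
  , py :: {-# UNPACK #-} !Double
  }

-- Without the bang on acc, this would build a chain of (+) thunks
-- and only collapse it at the end: a classic space leak.
sumXs :: [Point] -> Double
sumXs = go 0
  where
    go !acc []               = acc
    go !acc (Point x _ : ps) = go (acc + x) ps

main :: IO ()
main = print (sumXs [Point 1 2, Point 3 4, Point 5 6])
```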
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
I hear you, my friend.

What I love about Haskell is that a lot of algorithms are very clean to express and understand compared to, say, Lisp or C. Compared to Lisp, function manipulation is also very clean (even compared to Racket). A great plus is also type inference.

The bad side is that direct translations of algorithms are almost always very slow, and the work needed to make them perform is very mind-bending. I sometimes feel as if I am doing the work of a compiler/optimizer, and wonder if a computer program could do these translations. In the correct hands (the Simons, Don Stewart, etc.) I am sure Haskell is a killer, but in the hands of lowly, imperative mortals it is a time bomb. A good example of this is narrated in Clarke's Superiority: http://www.mayofamily.com/RLM/txt_Clarke_Superiority.html

I am a total Haskell newbie, so please ignore me if my opinion sounds stupid, but I think I read once that Peyton Jones said that the Haskell of the future will be strict.

Thanks!

2012/11/29 Fixie Fixie fixie.fi...@rocketmail.com
> [...]
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
If you can give an example of some underperforming code, I'm sure someone (or several people) on this list would be more than happy to help you make it more performant. Generally, it doesn't take much. It's all in knowing where to look.

Also, if you know performance is key, you should be using the performance-oriented data structures (ByteString, Text, Vector) from the very beginning. Personally, I never find myself using Data.Array, and rarely use String in real code. It's just not worth the performance headaches.

And finally, depending on what you're doing, neither Haskell nor C might be right for you! Especially with certain numerics-related code, you might find Fortran, OpenCL, or CUDA easier to make performant.

Examples of this would be lovely.

- Clark

On Thu, Nov 29, 2012 at 2:00 PM, Alfredo Di Napoli alfredo.dinap...@gmail.com wrote:
> [...]
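As a sketch of the Vector suggestion (my example, not Clark's): an unboxed vector is a flat array of machine values, and the library's stream fusion means the generate-then-sum pipeline below compiles to a single loop with no intermediate vector allocated.

```haskell
import qualified Data.Vector.Unboxed as U

-- One fused loop over unboxed Doubles: no per-element boxing, and
-- no intermediate vector materialized between generate and sum.
sumSquares :: Int -> Double
sumSquares n = U.sum (U.generate n (\i -> let x = fromIntegral (i + 1) in x * x))

main :: IO ()
main = print (sumSquares 4)
```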
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
Oh, my - what an indentation :-) New try:

----- Forwarded message -----
From: Fixie Fixie fixie.fi...@rocketmail.com
To: haskell-cafe@haskell.org
Cc: Clark Gaebel cgae...@uwaterloo.ca
Sent: Thursday, 29 November 2012 20:57
Subject: Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...

Sure, the code is from the url I mentioned, both in C and in Haskell. It already seems to me like the Haskell version should run fast - it uses unboxed arrays and even unsafe functions to make it run faster.

Well, have a look:

C version:

#include <stdint.h>
#include <stdio.h>

#define VDIM 100
#define VNUM 10

uint64_t prng (uint64_t w) {
    w ^= w << 13;
    w ^= w >> 7;
    w ^= w << 17;
    return w;
}

void kahanStep (double *s, double *c, double x) {
    double y, t;
    y  = x - *c;
    t  = *s + y;
    *c = (t - *s) - y;
    *s = t;
}

void kahan (double s[], double c[]) {
    for (int i = 1; i <= VNUM; i++) {
        uint64_t w = i;
        for (int j = 0; j < VDIM; j++) {
            kahanStep(&s[j], &c[j], w);
            w = prng(w);
        }
    }
}

int main (int argc, char* argv[]) {
    double acc[VDIM], err[VDIM];
    for (int i = 0; i < VDIM; i++) { acc[i] = err[i] = 0.0; }
    kahan(acc, err);
    printf("[ ");
    for (int i = 0; i < VDIM; i++) { printf("%g ", acc[i]); }
    printf("]\n");
}

And the Haskell version:

{-# LANGUAGE CPP, BangPatterns #-}
module Main (main) where

#define VDIM 100
#define VNUM 10

import Data.Array.Base
import Data.Array.ST
import Data.Array.Unboxed
import Control.Monad.ST
import GHC.Word
import Control.Monad
import Data.Bits

prng :: Word -> Word
prng w = w'
  where
    !w1 = w `xor` (w `shiftL` 13)
    !w2 = w1 `xor` (w1 `shiftR` 7)
    !w' = w2 `xor` (w2 `shiftL` 17)

type Vec s = STUArray s Int Double

kahan :: Vec s -> Vec s -> ST s ()
kahan s c = do
    let inner w j
          | j < VDIM = do
              !cj <- unsafeRead c j
              !sj <- unsafeRead s j
              let !y  = fromIntegral w - cj
                  !t  = sj + y
                  !w' = prng w
              unsafeWrite c j ((t - sj) - y)
              unsafeWrite s j t
              inner w' (j + 1)
          | otherwise = return ()
    forM_ [1 .. VNUM] $ \i -> inner (fromIntegral i) 0

calc :: ST s (Vec s)
calc = do
    s <- newArray (0, VDIM - 1) 0
    c <- newArray (0, VDIM - 1) 0
    kahan s c
    return s

main :: IO ()
main = print . elems $ runSTUArray calc

Cheers,
Felix
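One detail worth flagging in the Haskell version above (my observation, consistent with the doubleFromInteger follow-up later in the thread): the only floating-point input per step is `fromIntegral w`, so the Word-to-Double conversion sits on the hot path of `kahan`'s inner loop, and a slow conversion dominates the whole benchmark.

```haskell
-- The conversion executed once per inner-loop iteration in kahan.
-- On affected GHC versions this did not compile to a single machine
-- instruction, which is enough to dwarf the arithmetic around it.
hotConversion :: Word -> Double
hotConversion = fromIntegral

main :: IO ()
main = print (hotConversion 42)
```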
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
Ack, it seems like you're running into one of these bugs (all now fixed, but I don't know in which GHC version): http://hackage.haskell.org/trac/ghc/search?q=doubleFromInteger
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
On Thu, Nov 29, 2012 at 1:00 PM, Fixie Fixie fixie.fi...@rocketmail.com wrote:
> The program seems to take around 6 seconds on my linux box, while the C version goes for 0.06 seconds. That is really some regression bug :-) Anyone with a more recent version than 7.4.1?

On 7.4.2:

$ time ./c_test
...
real    0m0.145s
user    0m0.040s
sys     0m0.003s

$ time ./Test
...
real    0m0.234s
user    0m0.220s
sys     0m0.006s

Both compiled with -O2.
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
On Thu, Nov 29, 2012 at 1:23 PM, Fixie Fixie fixie.fi...@rocketmail.com wrote:
> That's really an argument for upgrading to 7.4.2 :-) Another reason for doing things with Haskell is this mailing list.

FYI, I'm still looking into this issue, as I'm not 100% happy with the code GHC generates.
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
On 29 November 2012 18:09, Fixie Fixie fixie.fi...@rocketmail.com wrote:
> What is your experience, dear haskellers? To me it seems this beautiful language is useless without a better lazy/eager analyzer.

Since when has speed been the sole arbiter of utility? 10 years ago I switched from Clean to Haskell, even though Clean was then faster (and included a good strictness analyser). The libraries available for Haskell are what swung the decision for me.