Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
I see. I read some chapters of Purely Functional Data Structures when I was in college in order to understand some tree algorithms, but not the whole book. Do you think it could help me understand performance problems with code (poorly) written in Haskell? From reading your post, I can guess the book is a good start, but what about Haskell/GHC-specific hacks? Is there a good written source for gathering the required knowledge? For example, I remember I struggled a lot with Arrays once, and only days after digging in did I find that Arrays are a no-no in Haskell. That was before finding this mailing list, though.

On Dec 1, 2012 9:18 PM, wren ng thornton w...@freegeek.org wrote:

> On 11/29/12 2:17 PM, Ivan Salazar wrote:
>> The bad side is that direct translations of algorithms are almost always very slow, and the work needed to make them perform is very mind-bending.
>
> Indeed. The thing is, all algorithms make (implicit) assumptions about the cost model of the underlying language. The problem comes about because the assumptions made in most algorithms are not valid in Haskell. This isn't just an issue of laziness; the entire computational model of Haskell (e.g., STG) is radically different from imperative languages.
>
> For example, in many cases just passing arguments around (a la the State monad) is much faster than using ST and explicit mutation. GHC does a lot of clever code reorganization, but mutation breaks countless purity laws and so it inhibits the application of these optimizations. Unfortunately, the vast majority of algorithms out there assume you're working in a language where mutation is essentially free. I'm not talking about any significant theoretical issues about using mutation or not; I'm just talking about the implicit assumptions that algorithm implementers make. If you believe mutation is essentially free, then it makes sense to create an initial object and then incrementally mutate it this way and that until you get to where you want.
> But, a great deal of the time there's nothing actually stopping us from gathering all the information and allocating the final result directly. However, if you're not used to thinking in those terms, then the conceptual reorganization required to see how to gather all the data at once is indeed mind-bending.
>
> -- Live well,
> ~wren

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
On 11/29/12 2:17 PM, Ivan Salazar wrote:
> The bad side is that direct translations of algorithms are almost always very slow, and the work needed to make them perform is very mind-bending.

Indeed. The thing is, all algorithms make (implicit) assumptions about the cost model of the underlying language. The problem comes about because the assumptions made in most algorithms are not valid in Haskell. This isn't just an issue of laziness; the entire computational model of Haskell (e.g., STG) is radically different from imperative languages.

For example, in many cases just passing arguments around (a la the State monad) is much faster than using ST and explicit mutation. GHC does a lot of clever code reorganization, but mutation breaks countless purity laws and so it inhibits the application of these optimizations. Unfortunately, the vast majority of algorithms out there assume you're working in a language where mutation is essentially free. I'm not talking about any significant theoretical issues about using mutation or not; I'm just talking about the implicit assumptions that algorithm implementers make. If you believe mutation is essentially free, then it makes sense to create an initial object and then incrementally mutate it this way and that until you get to where you want. But, a great deal of the time there's nothing actually stopping us from gathering all the information and allocating the final result directly. However, if you're not used to thinking in those terms, then the conceptual reorganization required to see how to gather all the data at once is indeed mind-bending.

-- Live well,
~wren
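wren's "gather all the information and allocate the final result directly" has a concrete analogue in the array API. A minimal sketch (my illustration, not from the thread): `accumArray` from the `array` package builds the finished array from a list of contributions in one pass, where the imperative habit would be to allocate zeros and mutate slot by slot.

```haskell
import Data.Array.Unboxed (UArray, accumArray, elems, listArray)

-- Direct construction: describe the finished array, allocate it once.
squares :: Int -> UArray Int Int
squares n = listArray (0, n - 1) [ i * i | i <- [0 .. n - 1] ]

-- accumArray gathers every (index, contribution) pair up front and
-- builds the final array in a single pass: no visible mutation,
-- one allocation, where a C version would zero an array and loop.
histogram :: (Int, Int) -> [Int] -> UArray Int Int
histogram bnds xs = accumArray (+) 0 bnds [ (x, 1) | x <- xs ]

main :: IO ()
main = do
  print (elems (squares 5))
  print (elems (histogram (0, 2) [0, 1, 1, 2, 2, 2]))
```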
[Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
Hi all haskellers,

Every now and then I get the feeling that doing my job code in Haskell would be a good idea. I have tried a couple of times, but each time I seem to run into performance problems - I do lots of heavy computing.

The problem seems to be connected to lazy evaluation, which makes my programs so slow that I really cannot show them to anyone. I have tried all the tricks in the books, like !, seq, non-lazy datatypes...

I was poking around to see if this had changed, then I ran into this forum post: http://stackoverflow.com/questions/9409634/is-indexing-of-data-vector-unboxed-mutable-mvector-really-this-slow

The last solution was a Haskell program which was in the 3x range to C, which I think is ok. This was in the days of ghc 7.0.

I then tried compiling the programs myself (ghc 7.4.1), but found that the C program was now more than 100x faster. The ghc code was compiled with both -O2 and -O3, giving only small differences on my 64-bit Linux box.

So it seems something has changed - and even small examples are still not safe when it comes to the lazy monster. It reminds me of some code I read a couple of years ago where one of the Simons actually fired off a new thread, just to make sure a variable was evaluated. A sad thing, since I am more than willing to go for Haskell if it proves to be usable.

If anyone can see what is wrong with the code (there are two Haskell versions on the page; I have tried the last and fastest one) it would also be interesting.

What is your experience, dear haskellers? To me it seems this beautiful language is useless without a better lazy/eager analyzer.

Cheers,
Felix
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
Hi there,

I'm only an amateur, so just my 2 cents: Haskell can be really fast, but reaching that speed can be all but trivial. You need to use different data types (e.g. ByteString vs. the normal String type), rely on unconventional IO (e.g. Conduit, Iteratee), and still be ready to go outside base, using packages and functions which are not in base/the Haskell Platform (e.g. mwc-random).

My 2 cents :)

A.

On 29 November 2012 18:09, Fixie Fixie fixie.fi...@rocketmail.com wrote:
> [...]
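To make the ByteString-vs-String point concrete, here is a small sketch (the function is mine, not from the thread): String is a lazy linked list of Char, one heap cell per character, while a strict ByteString is a packed byte buffer, so scanning it compiles to a tight loop over memory.

```haskell
import qualified Data.ByteString.Char8 as B

-- Counting a byte in a packed buffer. The String equivalent would
-- chase one cons cell per character instead of walking a flat buffer.
countCommas :: B.ByteString -> Int
countCommas = B.count ','

main :: IO ()
main = print (countCommas (B.pack "a,b,c,d"))
```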
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
Hi Felix,

On Thu, Nov 29, 2012 at 10:09 AM, Fixie Fixie fixie.fi...@rocketmail.com wrote:
> The problem seems to be connected to lazy loading, which makes my programs so slow that I really can not show them to anyone. I have tried all tricks in the books, like !, seq, non-lazy datatypes...

My advice usually goes like this:

1. Use standard, high-performance libraries (I've made a list of high-quality libraries at https://github.com/tibbe/haskell-docs/blob/master/libraries-directory.md).
2. Make your data type fields strict.
3. Unpack primitive types (e.g. Int, Word, Float, Double).
4. Reduce allocation in tight loops (e.g. avoid creating lots of intermediate lists).

I always do 1-3, but only do 4 when it's really necessary (e.g. in the inner loop of some machine learning algorithm). Rarely are performance issues due to missing bang patterns on functions (although there are cases, e.g. when writing recursive functions with accumulators, where you need one).

> I was poking around to see if this had changed, then I ran into this forum post: http://stackoverflow.com/questions/9409634/is-indexing-of-data-vector-unboxed-mutable-mvector-really-this-slow
> [...]
> It reminds me of some code I read a couple of years ago where one of the Simons actually fired off a new thread, to make sure a variable was realized.

Note that the issues in that post are not due to laziness (i.e. there are no space leaks), but due to the code being more polymorphic than the C code, causing extra allocation and indirection.

> A sad thing, since I am more than willing to go for Haskell if it proves to be usable. [...] To me it seems this beautiful language is useless without a better lazy/eager analyzer.

It's definitely possible to write fast Haskell code (as some Haskell programmers manage to do so consistently), but I appreciate that it's harder than it should be. In my opinion the major things missing are a good text on how to write fast Haskell code and some tweaks to the compiler (e.g. unboxing strict primitive fields like Int by default).

Hope this helps.

Cheers,
Johan
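A minimal sketch of points 2-4 above (my example, not Johan's): strict, unpacked fields let GHC store the Doubles inline in the constructor rather than as pointers to boxed, possibly-thunked values, and the bang on the accumulator is exactly the "recursive function with an accumulator" case where a bang pattern is needed.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Strict (!) plus UNPACK: the two Doubles live inline in the
-- constructor instead of behind pointers to boxed values.
data Point = Point
  { px :: {-# UNPACK #-} !Double
  , py :: {-# UNPACK #-} !Double
  }

-- Without the bang on acc, this would build a chain of (+) thunks
-- and only collapse it at the end: a classic space leak.
sumXs :: [Point] -> Double
sumXs = go 0
  where
    go !acc []               = acc
    go !acc (Point x _ : ps) = go (acc + x) ps

main :: IO ()
main = print (sumXs [Point 1 2, Point 3 4, Point 5 6])
```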
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
I hear you, my friend.

What I love about Haskell is that a lot of algorithms are very clean to express and understand compared to, say, Lisp or C. Compared to Lisp, function manipulation is also very clean (even compared to Racket). A great plus is also type inference.

The bad side is that direct translations of algorithms are almost always very slow, and the work needed to make them perform is very mind-bending. I sometimes feel as if I am doing the work of a compiler/optimizer, and wonder if a computer program could do these translations. In the correct hands (the Simons, Don Stewart, etc.) I am sure Haskell is a killer, but in the hands of lowly, imperative mortals it is a time bomb. A good example of this is narrated in Clarke's Superiority: http://www.mayofamily.com/RLM/txt_Clarke_Superiority.html

I am a total Haskell newbie, so please ignore me if my opinion sounds stupid, but I think I read once that Peyton Jones said that the Haskell of the future will be strict.

Thanks!

2012/11/29 Fixie Fixie fixie.fi...@rocketmail.com
> [...]
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
If you can give an example of some underperforming code, I'm sure someone (or several people) on this list would be more than happy to help you make it more performant. Generally, it doesn't take much. It's all in knowing where to look.

Also, if you know performance is key, you should be using the performance-oriented data structures (ByteString, Text, Vector) from the very beginning. Personally, I never find myself using Data.Array, and rarely use String in real code. It's just not worth the performance headaches.

And finally, depending on what you're doing, neither Haskell nor C might be right for you! Especially with certain numerics-related code, you might find Fortran, OpenCL, or CUDA easier to make performant.

Examples of this would be lovely.

- Clark

On Thu, Nov 29, 2012 at 2:00 PM, Alfredo Di Napoli alfredo.dinap...@gmail.com wrote:
> [...]
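As a sketch of the Vector suggestion (my example, not Clark's): an unboxed vector is a flat array of machine values, and the library's stream fusion means the generate-then-sum pipeline below compiles to a single loop with no intermediate vector allocated.

```haskell
import qualified Data.Vector.Unboxed as U

-- One fused loop over unboxed Doubles: no per-element boxing, and
-- no intermediate vector materialized between generate and sum.
sumSquares :: Int -> Double
sumSquares n = U.sum (U.generate n (\i -> let x = fromIntegral (i + 1) in x * x))

main :: IO ()
main = print (sumSquares 4)
```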
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
Oh, my - what an indentation :-) New try:

----- Forwarded message -----
From: Fixie Fixie fixie.fi...@rocketmail.com
To: haskell-cafe@haskell.org
Cc: Clark Gaebel cgae...@uwaterloo.ca
Sent: Thursday, 29 November 2012 20:57
Subject: Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...

Sure, the code is from the url I mentioned, both in C and in Haskell. It already seems to me like the Haskell version should run fast - it uses unboxed arrays and even unsafe functions to make it run faster.

Well, have a look:

C version:

#include <stdint.h>
#include <stdio.h>

#define VDIM 100
#define VNUM 10

uint64_t prng (uint64_t w) {
    w ^= w << 13;
    w ^= w >> 7;
    w ^= w << 17;
    return w;
}

void kahanStep (double *s, double *c, double x) {
    double y, t;
    y  = x - *c;
    t  = *s + y;
    *c = (t - *s) - y;
    *s = t;
}

void kahan (double s[], double c[]) {
    for (int i = 1; i <= VNUM; i++) {
        uint64_t w = i;
        for (int j = 0; j < VDIM; j++) {
            kahanStep(&s[j], &c[j], w);
            w = prng(w);
        }
    }
}

int main (int argc, char* argv[]) {
    double acc[VDIM], err[VDIM];
    for (int i = 0; i < VDIM; i++) { acc[i] = err[i] = 0.0; }
    kahan(acc, err);
    printf("[ ");
    for (int i = 0; i < VDIM; i++) { printf("%g ", acc[i]); }
    printf("]\n");
}

And the Haskell version:

{-# LANGUAGE CPP, BangPatterns #-}
module Main (main) where

#define VDIM 100
#define VNUM 10

import Data.Array.Base
import Data.Array.ST
import Data.Array.Unboxed
import Control.Monad.ST
import GHC.Word
import Control.Monad
import Data.Bits

prng :: Word -> Word
prng w = w'
  where
    !w1 = w `xor` (w `shiftL` 13)
    !w2 = w1 `xor` (w1 `shiftR` 7)
    !w' = w2 `xor` (w2 `shiftL` 17)

type Vec s = STUArray s Int Double

kahan :: Vec s -> Vec s -> ST s ()
kahan s c = do
    let inner w j
          | j < VDIM = do
              !cj <- unsafeRead c j
              !sj <- unsafeRead s j
              let !y  = fromIntegral w - cj
                  !t  = sj + y
                  !w' = prng w
              unsafeWrite c j ((t - sj) - y)
              unsafeWrite s j t
              inner w' (j + 1)
          | otherwise = return ()
    forM_ [1 .. VNUM] $ \i -> inner (fromIntegral i) 0

calc :: ST s (Vec s)
calc = do
    s <- newArray (0, VDIM - 1) 0
    c <- newArray (0, VDIM - 1) 0
    kahan s c
    return s

main :: IO ()
main = print . elems $ runSTUArray calc

Cheers,
Felix
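One detail worth flagging in the Haskell version above (my observation, consistent with the doubleFromInteger follow-up later in the thread): the only floating-point input per step is `fromIntegral w`, so the Word-to-Double conversion sits on the hot path of `kahan`'s inner loop, and a slow conversion dominates the whole benchmark.

```haskell
-- The conversion executed once per inner-loop iteration in kahan.
-- On affected GHC versions this did not compile to a single machine
-- instruction, which is enough to dwarf the arithmetic around it.
hotConversion :: Word -> Double
hotConversion = fromIntegral

main :: IO ()
main = print (hotConversion 42)
```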
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
Ack, it seems like you're running into one of these bugs (all now fixed, but I don't know in which GHC version): http://hackage.haskell.org/trac/ghc/search?q=doubleFromInteger
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
On Thu, Nov 29, 2012 at 1:00 PM, Fixie Fixie fixie.fi...@rocketmail.com wrote:
> The program seems to take around 6 seconds on my linux box, while the C version goes for 0.06 seconds. That is really some regression bug :-) Anyone with a more recent version than 7.4.1?

On 7.4.2:

$ time ./c_test
...
real    0m0.145s
user    0m0.040s
sys     0m0.003s

$ time ./Test
...
real    0m0.234s
user    0m0.220s
sys     0m0.006s

Both compiled with -O2.
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
On Thu, Nov 29, 2012 at 1:23 PM, Fixie Fixie fixie.fi...@rocketmail.com wrote:
> That's really an argument for upgrading to 7.4.2 :-) Another reason for doing things with Haskell is this mailing list.

FYI, I'm still looking into this issue, as I'm not 100% happy with the code GHC generates.
Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...
On 29 November 2012 18:09, Fixie Fixie fixie.fi...@rocketmail.com wrote:
> What is your experience, dear haskellers? To me it seems this beautiful language is useless without a better lazy/eager analyzer.

Since when has speed been the sole arbiter of utility? 10 years ago I switched from Clean to Haskell, even though Clean was then faster (and included a good strictness analyser). The libraries available for Haskell are what swung the decision for me.