Re: [Haskell-cafe] A tale of Project Euler
On Fri, 30 Nov 2007, Daniel Fischer wrote: Am Freitag, 30. November 2007 14:39 schrieb Henning Thielemann: Is this thread still about the prime sieve? As I mentioned, I think one can avoid the mutable array, because if there is only a small number of array updates with much changes per update, it should be efficient enough to copy the array per update. I think in this case it's far less efficient than in-place update. Consider sieving primes up to 10^7, there are 446 primes with p^2 10^7, so you would have over 400 array updates. Even if you leave out the even numbers, an array going up to 10^7 would require some 800 KB memory (each odd number one bit), so overall, you'd allocate well over 300 MB (not at once, of course). not at once - that's the point. With some luck there are only two pieces of memory where the data is copied back and forth. It's obviously not the most efficient solution, but a non-monadic solution. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler [ST]
Bulat Ziganshin wrote: Hello Andrew, Friday, November 30, 2007, 12:10:16 AM, you wrote: I don't understand the ST monad. From what I can tell, it's not definable without using strange language extensions. (I don't really like using things where it's unclear why it works.) this extension used only to guarantee referential transparency, so you don't need to understand it to use ST if you still want, it's not that hard - rank-2 forall type used here to guarantee that code executed by runST is compatible with *any* return type which makes it impossible to return innards of this computation and therefore forces your code to be referential transparent runST may be defined without rank-2 type. it doesn't change anything in its low-level working (this rank-2 type is just empty value, like RealWorld), but allows you to break referential transparency: main = do a - runST (do a - newArray return a) ... x - runST (do x - readArray a return x) ... low-level implementation of all ST-bound operations is just direct call to equivalent IO operation: newSTArray = unsafeIOtoST newIOArray readSTArray = unsafeIOtoST readIOArray where unsafeIOtoST - just code-less cast operation Mmm, so basically it's rank-2 types I don't understand. ;-) Well anyway, given the multiple replies I've had, I now have some idea how this stuff works... ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
Am Freitag, 30. November 2007 14:39 schrieb Henning Thielemann: Is this thread still about the prime sieve? As I mentioned, I think one can avoid the mutable array, because if there is only a small number of array updates with much changes per update, it should be efficient enough to copy the array per update. I think in this case it's far less efficient than in-place update. Consider sieving primes up to 10^7, there are 446 primes with p^2 10^7, so you would have over 400 array updates. Even if you leave out the even numbers, an array going up to 10^7 would require some 800 KB memory (each odd number one bit), so overall, you'd allocate well over 300 MB (not at once, of course). Using an STUArray runs easily within 1MB allocated once. And why avoid mutable arrays in the first place? What's bad about thing = runSTUArray (do work)? Granted, work tends to be far less elegant code than list comprehensions c, but also much faster. Cheers, Daniel ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
Sebastian Sylvan wrote: On Nov 29, 2007 9:10 PM, Andrew Coppin [EMAIL PROTECTED] wrote: How do you avoid accidentally recomputing the list multiple times? What do you mean? It's exactly the same as your original program but with ST instead of IO? Why would it get accidentally recomputed in this scenario and not before? Because before the moment when it gets executed relative to the main I/O thread is explicitly defined, and now it isn't. I don't see Data.Array.Base documented anywhere. (How did you know it exists?) Hmm.. I don't remember. But now you know too! :-) I think it may be a secret GHC library that you're not supposed to know about... Hmm. Secret... library... How do you guys find out about all this stuff? (And if it's secret, should we be playing with it?) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
On Nov 30, 2007 5:39 PM, Andrew Coppin [EMAIL PROTECTED] wrote: Hmm. Secret... library... How do you guys find out about all this stuff? There's http://www.haskell.org/haskellwiki/Arrays#Unsafe_indexing.2C_freezing.2Fthawing.2C_running_over_array_elements . Cheers, -- Felipe. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
Daniel Fischer wrote: One thing: since You check the array bounds, the system needn't check them again, use unsafeWrite and unsafeRead. And use Int for the index, that would be MUCH faster. I can't find the functions you're talking about. I have however changed the index type. (Make little or no noticable speed difference. But then, it's already pretty damn fast in the first place...) Another thing: you can stop sieving when p*p size, another speedup Saves a few hundred milliseconds. Fifth thing: better use an STUArray, don't drag IO in if it's not necessary. I don't understand the ST monad. The obvious downside of course is that you must know how big to make the array at the start of your program, and you must compute the entire array before you can use any of it. Both limitations not precent in the lazy recursive list implementation... The size of the array can easily be estimated by the prime number theorem, Probably. But it doesn't compare to the elegance and simplicity of being able to generate an arbitrary number of primes on an as-needed basis. Keep enjoying Project Euler, Daniel Well, I don't know about enjoy - generally I just find it quite frustrating. It's kind of addictive trying to increase your score though... (Anybody here reached 100% yet?) I find that a lot of the problems don't seem to involve much math. (E.g., here is some data, process it into a list of integers, find the highest one. Where is the deep mathematics in that?) A few of them do, but a lot are simply to do with prime numbers, and as far as I'm away, it is impossible to know anything about prime numbers. In other words, you must compute them all the hard way. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
On Nov 29, 2007 6:43 PM, Andrew Coppin [EMAIL PROTECTED] wrote: Daniel Fischer wrote: One thing: since You check the array bounds, the system needn't check them again, use unsafeWrite and unsafeRead. And use Int for the index, that would be MUCH faster. I can't find the functions you're talking about. I have however changed the index type. (Make little or no noticable speed difference. But then, it's already pretty damn fast in the first place...) Another thing: you can stop sieving when p*p size, another speedup Saves a few hundred milliseconds. Fifth thing: better use an STUArray, don't drag IO in if it's not necessary. I don't understand the ST monad. There's not a whole lot to understand if you just want to use it (though it's all very cool from a theoretical standpoint too). Here are my minor changes to your program. import Data.Array.ST import Control.Monad.ST calc_primes :: [Word64] calc_primes = runST ( do grid - newArray (2,size) True seive 2 grid ) where seive :: Word64 - STUArray s Word64 Bool - ST s [Word64] seive p g = do mapM_ (\n - writeArray g n False) [p, 2*p .. size] mp' - next (p+1) g case mp' of Nothing - return [p] Just p' - do ps - seive p' g return (p:ps) next :: Word64 - STUArray s Word64 Bool - ST s (Maybe Word64) next p g = do if p == size then return Nothing else do t - readArray g p if t then return (Just p) else next (p+1) g The benefit should be obvious: No pesky IO type, so you can use it in your pure code. You just need to give a type signature somewhere to show the type system that you're using STUArray, but the rest just uses the same type class as you already used for the IOUArrays. And here are the modifications to use the unsafe reads and writes (42% speedup for me): import Data.Array.Base size = 101 :: Word64 calc_primes :: [Word64] calc_primes = runST ( do grid - newArray (2,size) True seive 2 grid ) where seive :: Word64 - STUArray s Word64 Bool - ST s [Word64] seive p g = do mapM_ (\n - unsafeWrite g (fromIntegral n) False) [p, 2*p .. size] mp' - next (p+1) g case mp' of Nothing - return [p] Just p' - do ps - seive p' g return (p:ps) next :: Word64 - STUArray s Word64 Bool - ST s (Maybe Word64) next p g = do if p == size then return Nothing else do t - unsafeRead g (fromIntegral p) if t then return (Just p) else next (p+1) g -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
Sebastian Sylvan wrote: On Nov 29, 2007 6:43 PM, Andrew Coppin [EMAIL PROTECTED] wrote: I don't understand the ST monad. There's not a whole lot to understand if you just want to use it (though it's all very cool from a theoretical standpoint too). From what I can tell, it's not definable without using strange language extensions. (I don't really like using things where it's unclear why it works.) Here are my minor changes to your program. import Data.Array.ST import Control.Monad.ST calc_primes :: [Word64] calc_primes = runST ( do grid - newArray (2,size) True seive 2 grid ) where seive :: Word64 - STUArray s Word64 Bool - ST s [Word64] seive p g = do mapM_ (\n - writeArray g n False) [p, 2*p .. size] mp' - next (p+1) g case mp' of Nothing - return [p] Just p' - do ps - seive p' g return (p:ps) next :: Word64 - STUArray s Word64 Bool - ST s (Maybe Word64) next p g = do if p == size then return Nothing else do t - readArray g p if t then return (Just p) else next (p+1) g The benefit should be obvious: No pesky IO type, so you can use it in your pure code. You just need to give a type signature somewhere to show the type system that you're using STUArray, but the rest just uses the same type class as you already used for the IOUArrays. How do you avoid accidentally recomputing the list multiple times? And here are the modifications to use the unsafe reads and writes (42% speedup for me): import Data.Array.Base size = 101 :: Word64 calc_primes :: [Word64] calc_primes = runST ( do grid - newArray (2,size) True seive 2 grid ) where seive :: Word64 - STUArray s Word64 Bool - ST s [Word64] seive p g = do mapM_ (\n - unsafeWrite g (fromIntegral n) False) [p, 2*p .. size] mp' - next (p+1) g case mp' of Nothing - return [p] Just p' - do ps - seive p' g return (p:ps) next :: Word64 - STUArray s Word64 Bool - ST s (Maybe Word64) next p g = do if p == size then return Nothing else do t - unsafeRead g (fromIntegral p) if t then return (Just p) else next (p+1) g I don't see Data.Array.Base documented anywhere. (How did you know it exists?) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
Am Donnerstag, 29. November 2007 19:43 schrieb Andrew Coppin: Daniel Fischer wrote: One thing: since You check the array bounds, the system needn't check them again, use unsafeWrite and unsafeRead. And use Int for the index, that would be MUCH faster. I can't find the functions you're talking about. As Sebastian already said, they're in Data.Array.Base, sorry, should've mentioned that. I have however changed the index type. (Make little or no noticable speed difference. But then, it's already pretty damn fast in the first place...) But it can be considerably faster. On my 5 yo 1200MHz thing, I can sieve up to 10 million in less than half a second: [EMAIL PROTECTED]:~/Documents/haskell/move time ./Sieve2 1000 664579 real0m0.419s user0m0.420s sys 0m0.000s And that's even faster than any sieve in C that I managed to write. (Okay, I don't know how to use bitsets in C, that might beat Haskell) Another thing: you can stop sieving when p*p size, another speedup Saves a few hundred milliseconds. Which is pretty much for a programme running below 1 second. Fifth thing: better use an STUArray, don't drag IO in if it's not necessary. I don't understand the ST monad. Nor do I. Just use it (and be prepared to use scoped type variables). The obvious downside of course is that you must know how big to make the array at the start of your program, and you must compute the entire array before you can use any of it. Both limitations not precent in the lazy recursive list implementation... The size of the array can easily be estimated by the prime number theorem, Probably. But it doesn't compare to the elegance and simplicity of being able to generate an arbitrary number of primes on an as-needed basis. True. But when speed matters, you have to bite the bullet. And if you don't know up to which bound you need the primes, but know it's a not too low bound, you can sieve the first part in an array and then go on using a list, that's what I do. Keep enjoying Project Euler, Daniel Well, I don't know about enjoy - generally I just find it quite frustrating. It's kind of addictive trying to increase your score though... (Anybody here reached 100% yet?) int-e (Bertram Felgenhauer, I believe), well he has not yet solved last week's, so currently he's at 99%, and there were a couple of others but they didn't persevere. I find that a lot of the problems don't seem to involve much math. (E.g., here is some data, process it into a list of integers, find the highest one. Where is the deep mathematics in that?) A few of them do, but a lot are simply to do with prime numbers, and as far as I'm away, it is impossible to know anything about prime numbers. In other words, you must compute them all the hard way. Some of the problems require little math, but more programming techniques. Others require a good grip of maths to find an efficient algorithm. Really deep maths can't be required, the site is not meant for professional mathematicians but for laypeople with mathematical interest. The aim is to make brute force unattractive (brute forcing times from 5 hours upwards), while a mathematical insight will often give the solution in a few nanoseconds. Many problems require both, some maths to find a good algorithm, and some programming techniques to make it run really fast. Cheers, Daniel ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
On Thu, Nov 29, 2007 at 09:10:16PM +, Andrew Coppin wrote: Sebastian Sylvan wrote: On Nov 29, 2007 6:43 PM, Andrew Coppin [EMAIL PROTECTED] wrote: I don't understand the ST monad. There's not a whole lot to understand if you just want to use it (though it's all very cool from a theoretical standpoint too). From what I can tell, it's not definable without using strange language extensions. (I don't really like using things where it's unclear why it works.) You can't implement IO in H98 either. You just have to accept it as a given. (As far as ST goes, runST is unsafePerformIO renamed. The only tricky bit is proving safety.) Stefan signature.asc Description: Digital signature ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
On Nov 29, 2007, at 6:19 PM, Stefan O'Rear wrote: On Thu, Nov 29, 2007 at 09:10:16PM +, Andrew Coppin wrote: Sebastian Sylvan wrote: On Nov 29, 2007 6:43 PM, Andrew Coppin [EMAIL PROTECTED] wrote: I don't understand the ST monad. ...[and ST uses language extensions Andrew doesn't understand.] (As far as ST goes, runST is unsafePerformIO renamed. The only tricky bit is proving safety.) To put it another way, runST is unsafePerformIO where somebody has already done the safety proof for you (so you know it's 100% safe). The strange extensions are simply a device to make the safety proof work. Indeed, if you drop the extensions it can all be made to work (just say runST :: ST () a - a) but you lose the safety proof and it's equivalent to unsafePerformIO. -Jan [The trick used in runST is one of my all-time favorite bits of type theory, and is what convinced me we wanted second-order types back before the first Haskell workshop.] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
Sebastian Sylvan: primes :: [Integer] primes = 2 : filter (null . primeFactors) [3,5..] primeFactors :: Integer- [Integer] primeFactors n = factor n primes where factor m (p:ps) | p*p m= [] | m `mod` p == 0 = p : factor (m `div` p) (p:ps) | otherwise = factor m ps Your definition gives a strange meaning to primeFactors. I'd want that for all n, product (primeFactors n) == n. I think this law holds for the code posted by Olivier. Of course I'd beautify his definition slightly by writing primes = 2 : filter isPrime [3,5..] isPrime = null . drop 1 . primeFactors primeFactors n | n = 2 = factor primes n factor pr@(p:ps) n | p*p n = [n] | r == 0= p : factor pr q | otherwise = factor ps n where (q,r) = quotRem n p Kalman -- Get a free email account with anti spam protection. http://www.bluebottle.com/tag/2 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
On Nov 28, 2007 12:12 PM, Kalman Noel [EMAIL PROTECTED] wrote: Sebastian Sylvan: primes :: [Integer] primes = 2 : filter (null . primeFactors) [3,5..] primeFactors :: Integer- [Integer] primeFactors n = factor n primes where factor m (p:ps) | p*p m= [] | m `mod` p == 0 = p : factor (m `div` p) (p:ps) | otherwise = factor m ps Your definition gives a strange meaning to primeFactors. I'd want that for all n, product (primeFactors n) == n. I think this law holds for the code posted by Olivier. Yes you're right. That is property should clearly hold. -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
Sebastian Sylvan wrote: On Nov 28, 2007 12:12 PM, Kalman Noel [EMAIL PROTECTED] wrote: Sebastian Sylvan: primes :: [Integer] primes = 2 : filter (null . primeFactors) [3,5..] primeFactors :: Integer- [Integer] primeFactors n = factor n primes where factor m (p:ps) | p*p m= [] | m `mod` p == 0 = p : factor (m `div` p) (p:ps) | otherwise = factor m ps Your definition gives a strange meaning to primeFactors. I'd want that for all n, product (primeFactors n) == n. I think this law holds for the code posted by Olivier. Yes you're right. That is property should clearly hold. There are problems for which it's important to know how many times a given prime factor occurs. And there are other problems where it is merely necessary to know which primes are factors. I would say it's useful to have *both* functions available... By the way, I'm now using a new and much uglier prime seive: size = 100 :: Word64 calc_primes :: IO [Word64] calc_primes = do grid - newArray (2,size) True seive 2 grid where seive :: Word64 - IOUArray Word64 Bool - IO [Word64] seive p g = do mapM_ (\n - writeArray g n False) [p, 2*p .. size] mp' - next (p+1) g case mp' of Nothing - return [p] Just p' - do ps - seive p' g return (p:ps) next :: Word64 - IOUArray Word64 Bool - IO (Maybe Word64) next p g = do if p == size then return Nothing else do t - readArray g p if t then return (Just p) else next (p+1) g Seems moderately fast. At least, I can find the 10,001st prime in under 1 second... The obvious downside of course is that you must know how big to make the array at the start of your program, and you must compute the entire array before you can use any of it. Both limitations not precent in the lazy recursive list implementation... ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
Michaeljohn Clement wrote: Andrew Coppin wrote: First, somebody else wrote this in C: int n = 2 , m , primesFound = 0; for( n=0;n MAX_NUMBERS;n++ ) if( prime[n] ) { primesFound++; if( primesFound == 10001 ) cout n is the 10001st prime. endl; Um, I can't *believe* nobody else pointed this out, but that isn't C, it's C++. Really? How can you tell? Well anyway, the guy who wrote it said it's C. I suppose it was a simple oversight... ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
Am Mittwoch, 28. November 2007 22:31 schrieb Andrew Coppin: There are problems for which it's important to know how many times a given prime factor occurs. And there are other problems where it is merely necessary to know which primes are factors. I would say it's useful to have *both* functions available... Yup By the way, I'm now using a new and much uglier prime seive: size = 100 :: Word64 calc_primes :: IO [Word64] calc_primes = do grid - newArray (2,size) True seive 2 grid where seive :: Word64 - IOUArray Word64 Bool - IO [Word64] seive p g = do mapM_ (\n - writeArray g n False) [p, 2*p .. size] mp' - next (p+1) g case mp' of Nothing - return [p] Just p' - do ps - seive p' g return (p:ps) next :: Word64 - IOUArray Word64 Bool - IO (Maybe Word64) next p g = do if p == size then return Nothing else do t - readArray g p if t then return (Just p) else next (p+1) g Seems moderately fast. At least, I can find the 10,001st prime in under 1 second... One thing: since You check the array bounds, the system needn't check them again, use unsafeWrite and unsafeRead. And use Int for the index, that would be MUCH faster. Another thing: you can stop sieving when p*p size, another speedup Third thing: In my experience mapM_ is relatively slow, hand coded loops are faster (though much uglier), but YMMV Fourth thing: sieve only odd numbers Fifth thing: better use an STUArray, don't drag IO in if it's not necessary. The obvious downside of course is that you must know how big to make the array at the start of your program, and you must compute the entire array before you can use any of it. Both limitations not precent in the lazy recursive list implementation... The size of the array can easily be estimated by the prime number theorem, pi(x) ~ x/log x, where pi(x) is the number of primes not exceeding x, so for 10,001 primes, find x with x/log x about 10,000, add a bit for safety, a bound of 120,000 will do. If you want the n-th prime, you don't want to use the smaller primes anyway, if you need all primes, chances are that prime generation will take negligible time in comparison to your main algo anyway. Keep enjoying Project Euler, Daniel ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
On Nov 28, 2007 9:28 PM, Andrew Coppin [EMAIL PROTECTED] wrote: Michaeljohn Clement wrote: Andrew Coppin wrote: First, somebody else wrote this in C: int n = 2 , m , primesFound = 0; for( n=0;n MAX_NUMBERS;n++ ) if( prime[n] ) { primesFound++; if( primesFound == 10001 ) cout n is the 10001st prime. endl; Um, I can't *believe* nobody else pointed this out, but that isn't C, it's C++. Really? How can you tell? cout, endl etc. -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
On Nov 28, 2007, at 16:28 , Andrew Coppin wrote: Michaeljohn Clement wrote: Andrew Coppin wrote: First, somebody else wrote this in C: int n = 2 , m , primesFound = 0; for( n=0;n MAX_NUMBERS;n++ ) if( prime[n] ) { primesFound++; if( primesFound == 10001 ) cout n is the 10001st prime. endl; Um, I can't *believe* nobody else pointed this out, but that isn't C, it's C++. Really? How can you tell? Strictly speaking, the I/O is done with C++ operators and variables. But the actual algorithm is C, if you replace the cout ... with printf() then the whole thing will be C. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED] system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED] electrical and computer engineering, carnegie mellon universityKF8NH ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
On Nov 27, 2007 2:34 PM, Andrew Coppin [EMAIL PROTECTED] wrote: Hi guys. Somebody just introduced me to a thing called Project Euler. I gather it's well known around here... Anyway, I was a little bit concerned about problem #7. The problem is basically figure out what the 10,001st prime number is. Consider the following two programs for solving this: First, somebody else wrote this in C: int n = 2 , m , primesFound = 0; for( n=0;n MAX_NUMBERS;n++ ) if( prime[n] ) { primesFound++; if( primesFound == 10001 ) cout n is the 10001st prime. endl; for(m=2*n;mMAX_NUMBERS;m+=n) prime[m]=false; } Second, my implementation in Haskell: primes :: [Integer] primes = seive [2..] where seive (p:xs) = p : seive (filter (\x - x `mod` p 0) xs) main = print (primes !! 1) Well, we know who's winning the beauty contest. But here's the worrying thing: C program: 0.016 seconds Haskell program: 41 seconds Eeeps! o_O That's Not Good(tm). Replacing primes :: [Integer] with primes :: [Word32] speeds the Haskell version up to 18 seconds, which is much faster but still a joke compared to C. Running the Haskell code on a 2.2 GHz AMD Athlon64 X2 instead of a 1.5 GHz AMD Athlon drops the execution time down to 3 seconds. (So clearly the box I was using in my lunch break at work is *seriously* slow... presumably the PC running the C code isn't.) So, now I have a Haskell version that's only several hundred times slower. Neither program is especially optimised, yet the C version is drastically faster. This makes me sad. :-( By the way... I tried using Data.List.Stream instead. This made the program about 40% slower. I also tried both -fasm and -fvia-c. The difference was statistically insignificant. (Hey guys, nice work on the native codegen!) The difference in compilation speed was fairly drastic though, needless to say. (The machine is kinda low on RAM, so trying to call GCC makes it thrash like hell. So does linking, actually...) Also, I'm stuck with problem #10. (Find the sum of all primes less than 1 million.) I've let the program run for well over 15 minutes, and still no answer is forthcomming. It's implemented using the same primes function as above, with a simple filter and sum. (The type has to be changed to [Word64] to avoid overflowing the result though.) The other guy claims to have a C solution that takes 12 ms. There's a hell of a difference between 12 ms and over half an hour...(!) Clearly something is horribly wrong here. Uh... help? The algorithm you use to compute primes is actually quite inefficient, and in particular, it is NOT the same algorithm that the C program is using, despite first appearances! The call to 'filter' in the sieve function works by checking *every* number for divisibility by p, and only keeping those that aren't divisible by p; whereas the C program simply marks multiples of p as non-prime, skipping over those which are not multiples. So the Haskell version basically searches for primes by doing trial division on every single integer, whereas the C version is actually using a sieve. For more information you might want to read this paper, which includes a much better primes implementation: www.cs.hmc.edu/~*oneill*/papers/*Sieve*-JFP.pdf In fact, this very same topic was discussed on the list a while back, you can read it here: http://thread.gmane.org/gmane.comp.lang.haskell.cafe/19699/focus=19798 -Brent ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
andrewcoppin: Hi guys. Somebody just introduced me to a thing called Project Euler. I gather it's well known around here... Anyway, I was a little bit concerned about problem #7. The problem is basically figure out what the 10,001st prime number is. Consider the following two programs for solving this: First, somebody else wrote this in C: int n = 2 , m , primesFound = 0; for( n=0;n MAX_NUMBERS;n++ ) if( prime[n] ) { primesFound++; if( primesFound == 10001 ) cout n is the 10001st prime. endl; for(m=2*n;mMAX_NUMBERS;m+=n) prime[m]=false; } Second, my implementation in Haskell: primes :: [Integer] primes = seive [2..] where seive (p:xs) = p : seive (filter (\x - x `mod` p 0) xs) main = print (primes !! 1) This is an FAQ. Unless you use the same algorithm and data types in benchmarks, you're not really benchmarking anything. And expecting one of the worst possible algorithms to be good is hoping for a little too much :) When you do use similar data types and algorithms, you get similar performance: http://shootout.alioth.debian.org/gp4/benchmark.php?test=nsievebitslang=all I'll reiterate: use the same data structures and algorithms, if you want the same performance. -- Don ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
On Tue, 27 Nov 2007, Andrew Coppin wrote: So, now I have a Haskell version that's only several hundred times slower. Neither program is especially optimised, yet the C version is drastically faster. This makes me sad. :-( I think the C version is so much faster because it does not need allocations in the inner loop. If you rewrite your program using a mutable array your program should become faster - and more ugly. However it might be fast enough if you create a new array for each run over the array with a new prime. Then you do not need mutable arrays. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
On 11/27/07, Andrew Coppin [EMAIL PROTECTED] wrote: Hi guys. Somebody just introduced me to a thing called Project Euler. I gather it's well known around here... Anyway, I was a little bit concerned about problem #7. The problem is basically figure out what the 10,001st prime number is. Consider the following two programs for solving this: First, somebody else wrote this in C: int n = 2 , m , primesFound = 0; for( n=0;n MAX_NUMBERS;n++ ) if( prime[n] ) { primesFound++; if( primesFound == 10001 ) cout n is the 10001st prime. endl; for(m=2*n;mMAX_NUMBERS;m+=n) prime[m]=false; } Second, my implementation in Haskell: primes :: [Integer] primes = seive [2..] where seive (p:xs) = p : seive (filter (\x - x `mod` p 0) xs) main = print (primes !! 1) Well, we know who's winning the beauty contest. But here's the worrying thing: C program: 0.016 seconds Haskell program: 41 seconds Eeeps! o_O That's Not Good(tm). Replacing primes :: [Integer] with primes :: [Word32] speeds the Haskell version up to 18 seconds, which is much faster but still a joke compared to C. Running the Haskell code on a 2.2 GHz AMD Athlon64 X2 instead of a 1.5 GHz AMD Athlon drops the execution time down to 3 seconds. (So clearly the box I was using in my lunch break at work is *seriously* slow... presumably the PC running the C code isn't.) So, now I have a Haskell version that's only several hundred times slower. Neither program is especially optimised, yet the C version is drastically faster. This makes me sad. :-( By the way... I tried using Data.List.Stream instead. This made the program about 40% slower. I also tried both -fasm and -fvia-c. The difference was statistically insignificant. (Hey guys, nice work on the native codegen!) The difference in compilation speed was fairly drastic though, needless to say. (The machine is kinda low on RAM, so trying to call GCC makes it thrash like hell. So does linking, actually...) Also, I'm stuck with problem #10. (Find the sum of all primes less than 1 million.) I've let the program run for well over 15 minutes, and still no answer is forthcomming. It's implemented using the same primes function as above, with a simple filter and sum. (The type has to be changed to [Word64] to avoid overflowing the result though.) The other guy claims to have a C solution that takes 12 ms. There's a hell of a difference between 12 ms and over half an hour...(!) Clearly something is horribly wrong here. Uh... help? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe Hi Andrew, I don't remember who I stole this prime computation from, but it is very fast (10001's prime in 0.06 sec with Int and 0.2 with Integer on my machine) and not overly complex: primes :: [Integer] primes = 2 : filter (l1 . primeFactors) [3,5..] where l1 (_:[]) = True l1 _ = False primeFactors :: Integer - [Integer] primeFactors n = factor n primes where factor _ [] = [] factor m (p:ps) | p*p m= [m] | m `mod` p == 0 = p : factor (m `div` p) (p:ps) | otherwise = factor m ps I used it a lot while playing with Euler Project. Many of the problems require prime calculation. Cheers, Olivier. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
Andrew Coppin wrote: Also, I'm stuck with problem #10. (Find the sum of all primes less than 1 million.) I've let the program run for well over 15 minutes, and still no answer is forthcomming. It's implemented using the same primes function as above, with a simple filter and sum. (The type has to be changed to [Word64] to avoid overflowing the result though.) The other guy claims to have a C solution that takes 12 ms. There's a hell of a difference between 12 ms and over half an hour...(!) Clearly something is horribly wrong here. Uh... help? I just let it run to completion. Took 25 minutes and 15 seconds to find the (correct) answer. I would have preferred it to take 12 ms... ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
On Nov 27, 2007 8:54 PM, Olivier Boudry [EMAIL PROTECTED] wrote: Hi Andrew, I don't remember who I stole this prime computation from, but it is very fast (10001's prime in 0.06 sec with Int and 0.2 with Integer on my machine) and not overly complex: primes :: [Integer] primes = 2 : filter (l1 . primeFactors) [3,5..] where l1 (_:[]) = True l1 _ = False primeFactors :: Integer - [Integer] primeFactors n = factor n primes where factor _ [] = [] factor m (p:ps) | p*p m= [m] | m `mod` p == 0 = p : factor (m `div` p) (p:ps) | otherwise = factor m ps I used it a lot while playing with Euler Project. Many of the problems require prime calculation. That is indeed a nice and clear version that's pretty fast. It's basically the same as the C version except backwards (i.e. examine a number and work backwards through its divisors, rather than filling in a map of all multiples of a known prime). Let me suggest the following slight modification (primeFactors in your version doesn't actually return prime factors - it returns prime factors *or* a list of just the number itself), primes :: [Integer] primes = 2 : filter (null . primeFactors) [3,5..] primeFactors :: Integer- [Integer] primeFactors n = factor n primes where factor m (p:ps) | p*p m= [] | m `mod` p == 0 = p : factor (m `div` p) (p:ps) | otherwise = factor m ps This is roughly 35% faster on my machine with GHC 6.7.20070730 too, but the point wasn't to make it faster, but clearer. -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
On Nov 27, 2007 8:44 PM, Andrew Coppin [EMAIL PROTECTED] wrote: Andrew Coppin wrote: Also, I'm stuck with problem #10. (Find the sum of all primes less than 1 million.) I've let the program run for well over 15 minutes, and still no answer is forthcomming. It's implemented using the same primes function as above, with a simple filter and sum. (The type has to be changed to [Word64] to avoid overflowing the result though.) The other guy claims to have a C solution that takes 12 ms. There's a hell of a difference between 12 ms and over half an hour...(!) Clearly something is horribly wrong here. Uh... help? I just let it run to completion. Took 25 minutes and 15 seconds to find the (correct) answer. I would have preferred it to take 12 ms... FWIW the following code took 0.77s on my machine: primes :: [Integer] primes = 2 : filter (null . primeFactors) [3,5..] primeFactors :: Integer- [Integer] primeFactors n = factor n primes where factor m (p:ps) | p*p m= [] | m `mod` p == 0 = p : factor (m `div` p) (p:ps) | otherwise = factor m ps main = print ( sum ( takeWhile ( 100) primes ) ) -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
On the other hand, I must relay to you how much fun I had with certain other problems. For example, problem #12. I started with this: triangles = scanl1 (+) [1..] divisors n = length $ filter (\x - n `mod` x == 0) [1..n] answer = head $ dropWhile (\n - divisors n 500) triangles Sadly, this is *absurdly* slow. I gave up after about 5 minutes of waiting. It had only scanned up to T[1,200]. Then I tried the following: triangles = scanl1 (+) [1..] primes :: [Word32] primes = seive [2..] where seive (p:xs) = p : seive (filter (\x - x `mod` p 0) xs) factors = work primes where work (p:ps) n = if p n then [] else if n `mod` p == 0 then p : work (p:ps) (n `div` p) else work ps n count_factors n = let fs = factors n fss = group fs in product $ map ((1+) . length) fss answer = head $ dropWhile (\n - count_factors n 500) triangles By looking only at *prime* divisors and then figuring out how many divisors there are in total (you don't even have to *compute* what they are, just *count* them!), I was able to get the correct solution in UNDER ONE SECOND! o_O :-D :^] Now how about that for fast, eh? (Actually, having solved it I discovered there's an even better way - apparently there's a relationship between the number of divisors of consecutive triangular numbers. I'm too noobish to understand the number theory here...) Similarly, problem 24 neatly illustrates everything that is sweet and pure about Haskell: choose [x] = [(x,[])] choose (x:xs) = (x,xs) : map (\(y,ys) - (y,x:ys)) (choose xs) permutations [x] = [[x]] permutations xs = do (x1,xs1) - choose xs xs2 - permutations xs1 return (x1 : xs2) answer = (permutations 0123456789) !! 99 This finds the answer in less than 3 seconds. And it is beautiful to behold. Yes, there is apparently some more sophisticated algorithm than actually enumerating the permutations. But I love the way I threw code together in the most intuitive way that fell into my head, and got a fast, performant answer. Perfection! :-) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
Andrew Coppin wrote: On the other hand, I must relay to you how much fun I had with certain other problems. You may want to look at: http://haskell.org/haskellwiki/Euler_problems and make some contributions. But be very careful what you peek at, so don't spoil your own fun. Regards, Yitz ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
On 11/27/07, Sebastian Sylvan [EMAIL PROTECTED] wrote: That is indeed a nice and clear version that's pretty fast. It's basically the same as the C version except backwards (i.e. examine a number and work backwards through its divisors, rather than filling in a map of all multiples of a known prime). Let me suggest the following slight modification (primeFactors in your version doesn't actually return prime factors - it returns prime factors *or* a list of just the number itself), primes :: [Integer] primes = 2 : filter (null . primeFactors) [3,5..] primeFactors :: Integer- [Integer] primeFactors n = factor n primes where factor m (p:ps) | p*p m= [] | m `mod` p == 0 = p : factor (m `div` p) (p:ps) | otherwise = factor m ps This is roughly 35% faster on my machine with GHC 6.7.20070730 too, but the point wasn't to make it faster, but clearer. -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862 Great remark, it's even simpler like this. By the way I just found the article I stole this algorithm from: http://www.haskell.org/haskellwiki/99_questions/31_to_41 last one in problem 35. Cheers, Olivier. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
Brent Yorgey wrote: The algorithm you use to compute primes is actually quite inefficient, and in particular, it is NOT the same algorithm that the C program is using, despite first appearances! The call to 'filter' in the sieve function works by checking *every* number for divisibility by p, and only keeping those that aren't divisible by p; whereas the C program simply marks multiples of p as non-prime, skipping over those which are not multiples. So the Haskell version basically searches for primes by doing trial division on every single integer, whereas the C version is actually using a sieve. I had to reread that several times to figure out what you're actually saying. So the Haskell version tests all N elements to see if P divides them, whereas the C version loops over an array (roughly) N / P times flagging the composite numbers, which is why it's so much damn faster. (Since as P grows, N / P plummets.) OK, cool. So... now how do I implement this without using mutable arrays? :-} PS. The *original* prime number thing I had was much slower. Used an is_prime function to do trial division by *every* number below N. (!!) The version I showed as much faster than that, but still quite slow, as you say... ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A tale of Project Euler
Don Stewart wrote: This is an FAQ. Unless you use the same algorithm and data types in benchmarks, you're not really benchmarking anything. And expecting one of the worst possible algorithms to be good is hoping for a little too much :) Well, if I was comparing my Haskell against some expert-optimised C library then yeah. What puzzled me is that both programs are unoptimised and appear to be using *the same* algorithm, and yet the performance is incomparable. Now that Brent has spoken, I understand the error of my ways. ;-) When you do use similar data types and algorithms, you get similar performance. One would hope so, yes... :-) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe