Re: [Haskell-cafe] A tale of Project Euler

2007-11-30 Thread Henning Thielemann

On Fri, 30 Nov 2007, Daniel Fischer wrote:

 Am Freitag, 30. November 2007 14:39 schrieb Henning Thielemann:

  Is this thread still about the prime sieve? As I mentioned, I think one
  can avoid the mutable array, because if there is only a small number of
  array updates with much changes per update, it should be efficient enough
  to copy the array per update.

 I think in this case it's far less efficient than in-place update. Consider
 sieving primes up to 10^7, there are 446 primes with p^2  10^7, so you would
 have over 400 array updates. Even if you leave out the even numbers, an array
 going up to 10^7 would require some 800 KB memory (each odd number one bit),
 so overall, you'd allocate well over 300 MB (not at once, of course).

not at once - that's the point. With some luck there are only two pieces
of memory where the data is copied back and forth. It's obviously not the
most efficient solution, but a non-monadic solution.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler [ST]

2007-11-30 Thread Andrew Coppin

Bulat Ziganshin wrote:

Hello Andrew,

Friday, November 30, 2007, 12:10:16 AM, you wrote:

  

I don't understand the ST monad.



  

 From what I can tell, it's not definable without using strange language
extensions. (I don't really like using things where it's unclear why it
works.)



this extension used only to guarantee referential transparency, so you
don't need to understand it to use ST

if you still want, it's not that hard - rank-2 forall type used here
to guarantee that code executed by runST is compatible with *any*
return type which makes it impossible to return innards of this
computation and therefore forces your code to be referential
transparent

runST may be defined without rank-2 type. it doesn't change anything in
its low-level working (this rank-2 type is just empty value, like
RealWorld), but allows you to break referential transparency:

main = do a - runST (do a - newArray
 return a)
  ...
  x - runST (do x - readArray a
 return x)
  ...

low-level implementation of all ST-bound operations is just direct
call to equivalent IO operation:

newSTArray = unsafeIOtoST newIOArray
readSTArray = unsafeIOtoST readIOArray

where unsafeIOtoST - just code-less cast operation
  


Mmm, so basically it's rank-2 types I don't understand. ;-)

Well anyway, given the multiple replies I've had, I now have some idea 
how this stuff works...


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-30 Thread Daniel Fischer
Am Freitag, 30. November 2007 14:39 schrieb Henning Thielemann:

 Is this thread still about the prime sieve? As I mentioned, I think one
 can avoid the mutable array, because if there is only a small number of
 array updates with much changes per update, it should be efficient enough
 to copy the array per update.

I think in this case it's far less efficient than in-place update. Consider 
sieving primes up to 10^7, there are 446 primes with p^2  10^7, so you would 
have over 400 array updates. Even if you leave out the even numbers, an array 
going up to 10^7 would require some 800 KB memory (each odd number one bit), 
so overall, you'd allocate well over 300 MB (not at once, of course). Using 
an STUArray runs easily within 1MB allocated once. 
And why avoid mutable arrays in the first place?
What's bad about 
thing = runSTUArray (do work)?
Granted, work tends to be far less elegant code than list comprehensions c, 
but also much faster.

Cheers,
Daniel
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-30 Thread Andrew Coppin

Sebastian Sylvan wrote:

On Nov 29, 2007 9:10 PM, Andrew Coppin [EMAIL PROTECTED] wrote:
  

How do you avoid accidentally recomputing the list multiple times?



What do you mean? It's exactly the same as your original program but
with ST instead of IO? Why would it get accidentally recomputed in
this scenario and not before?
  


Because before the moment when it gets executed relative to the main I/O 
thread is explicitly defined, and now it isn't.



I don't see Data.Array.Base documented anywhere. (How did you know it
exists?)



Hmm.. I don't remember. But now you know too! :-)
I think it may be a secret GHC library that you're not supposed to know about...
  


Hmm. Secret... library... How do you guys find out about all this stuff? 
(And if it's secret, should we be playing with it?)


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-30 Thread Felipe Lessa
On Nov 30, 2007 5:39 PM, Andrew Coppin [EMAIL PROTECTED] wrote:
 Hmm. Secret... library... How do you guys find out about all this stuff?

There's 
http://www.haskell.org/haskellwiki/Arrays#Unsafe_indexing.2C_freezing.2Fthawing.2C_running_over_array_elements
.

Cheers,

-- 
Felipe.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-29 Thread Andrew Coppin

Daniel Fischer wrote:
One thing: since You check the array bounds, the system needn't check them 
again, use unsafeWrite and unsafeRead. And use Int for the index, that would 
be MUCH faster.
  


I can't find the functions you're talking about. I have however changed 
the index type. (Make little or no noticable speed difference. But then, 
it's already pretty damn fast in the first place...)



Another thing: you can stop sieving when p*p  size, another speedup
  


Saves a few hundred milliseconds.


Fifth thing: better use an STUArray, don't drag IO in if it's not necessary.
  


I don't understand the ST monad.


The obvious downside of course is that you must know how big
to make the array at the start of your program, and you must compute the
entire array before you can use any of it. Both limitations not precent
in the lazy recursive list implementation...



The size of the array can easily be estimated by the prime number theorem,
  


Probably. But it doesn't compare to the elegance and simplicity of being 
able to generate an arbitrary number of primes on an as-needed basis.



Keep enjoying Project Euler,
Daniel
  


Well, I don't know about enjoy - generally I just find it quite 
frustrating. It's kind of addictive trying to increase your score 
though... (Anybody here reached 100% yet?)


I find that a lot of the problems don't seem to involve much math. 
(E.g., here is some data, process it into a list of integers, find the 
highest one. Where is the deep mathematics in that?) A few of them do, 
but a lot are simply to do with prime numbers, and as far as I'm away, 
it is impossible to know anything about prime numbers. In other words, 
you must compute them all the hard way.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-29 Thread Sebastian Sylvan
On Nov 29, 2007 6:43 PM, Andrew Coppin [EMAIL PROTECTED] wrote:
 Daniel Fischer wrote:
  One thing: since You check the array bounds, the system needn't check them
  again, use unsafeWrite and unsafeRead. And use Int for the index, that would
  be MUCH faster.
 

 I can't find the functions you're talking about. I have however changed
 the index type. (Make little or no noticable speed difference. But then,
 it's already pretty damn fast in the first place...)

  Another thing: you can stop sieving when p*p  size, another speedup
 

 Saves a few hundred milliseconds.

  Fifth thing: better use an STUArray, don't drag IO in if it's not necessary.
 

 I don't understand the ST monad.



There's not a whole lot to understand if you just want to use it
(though it's all very cool from a theoretical standpoint too). Here
are my minor changes to your program.

import Data.Array.ST
import Control.Monad.ST


calc_primes :: [Word64]
calc_primes = runST ( do
   grid - newArray (2,size) True
   seive 2 grid )
 where
   seive :: Word64 - STUArray s Word64 Bool - ST s [Word64]
   seive p g = do
 mapM_ (\n - writeArray g n False) [p, 2*p .. size]
 mp' - next (p+1) g
 case mp' of
   Nothing - return [p]
   Just p' - do
 ps - seive p' g
 return (p:ps)

   next :: Word64 - STUArray s Word64 Bool - ST s (Maybe Word64)
   next p g = do
 if p == size
   then return Nothing
   else do
 t - readArray g p
 if t
   then return (Just p)
   else next (p+1) g


The benefit should be obvious: No pesky IO type, so you can use it in
your pure code. You just need to give a type signature somewhere to
show the type system that you're using STUArray, but the rest just
uses the same type class as you already used for the IOUArrays.

And here are the modifications to use the unsafe reads and writes (42%
speedup for me):

import Data.Array.Base

size = 101 :: Word64

calc_primes :: [Word64]
calc_primes = runST ( do
   grid - newArray (2,size) True
   seive 2 grid )
 where
   seive :: Word64 - STUArray s Word64 Bool - ST s [Word64]
   seive p g = do
 mapM_ (\n - unsafeWrite g (fromIntegral n) False) [p, 2*p .. size]
 mp' - next (p+1) g
 case mp' of
   Nothing - return [p]
   Just p' - do
 ps - seive p' g
 return (p:ps)

   next :: Word64 - STUArray s Word64 Bool - ST s (Maybe Word64)
   next p g = do
 if p == size
   then return Nothing
   else do
 t - unsafeRead g (fromIntegral p)
 if t
   then return (Just p)
   else next (p+1) g

-- 
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-29 Thread Andrew Coppin

Sebastian Sylvan wrote:

On Nov 29, 2007 6:43 PM, Andrew Coppin [EMAIL PROTECTED] wrote:
  

I don't understand the ST monad.




There's not a whole lot to understand if you just want to use it
(though it's all very cool from a theoretical standpoint too).


From what I can tell, it's not definable without using strange language 
extensions. (I don't really like using things where it's unclear why it 
works.)



Here are my minor changes to your program.

import Data.Array.ST
import Control.Monad.ST


calc_primes :: [Word64]
calc_primes = runST ( do
   grid - newArray (2,size) True
   seive 2 grid )
 where
   seive :: Word64 - STUArray s Word64 Bool - ST s [Word64]
   seive p g = do
 mapM_ (\n - writeArray g n False) [p, 2*p .. size]
 mp' - next (p+1) g
 case mp' of
   Nothing - return [p]
   Just p' - do
 ps - seive p' g
 return (p:ps)

   next :: Word64 - STUArray s Word64 Bool - ST s (Maybe Word64)
   next p g = do
 if p == size
   then return Nothing
   else do
 t - readArray g p
 if t
   then return (Just p)
   else next (p+1) g


The benefit should be obvious: No pesky IO type, so you can use it in
your pure code. You just need to give a type signature somewhere to
show the type system that you're using STUArray, but the rest just
uses the same type class as you already used for the IOUArrays.
  


How do you avoid accidentally recomputing the list multiple times?


And here are the modifications to use the unsafe reads and writes (42%
speedup for me):

import Data.Array.Base

size = 101 :: Word64

calc_primes :: [Word64]
calc_primes = runST ( do
   grid - newArray (2,size) True
   seive 2 grid )
 where
   seive :: Word64 - STUArray s Word64 Bool - ST s [Word64]
   seive p g = do
 mapM_ (\n - unsafeWrite g (fromIntegral n) False) [p, 2*p .. size]
 mp' - next (p+1) g
 case mp' of
   Nothing - return [p]
   Just p' - do
 ps - seive p' g
 return (p:ps)

   next :: Word64 - STUArray s Word64 Bool - ST s (Maybe Word64)
   next p g = do
 if p == size
   then return Nothing
   else do
 t - unsafeRead g (fromIntegral p)
 if t
   then return (Just p)
   else next (p+1) g

  


I don't see Data.Array.Base documented anywhere. (How did you know it 
exists?)


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-29 Thread Daniel Fischer
Am Donnerstag, 29. November 2007 19:43 schrieb Andrew Coppin:
 Daniel Fischer wrote:
  One thing: since You check the array bounds, the system needn't check
  them again, use unsafeWrite and unsafeRead. And use Int for the index,
  that would be MUCH faster.

 I can't find the functions you're talking about.

As Sebastian already said, they're in Data.Array.Base, sorry, should've 
mentioned that.

 I have however changed
 the index type. (Make little or no noticable speed difference. But then,
 it's already pretty damn fast in the first place...)


But it can be considerably faster. On my 5 yo 1200MHz thing, I can sieve up to 
10 million in less than half a second:
[EMAIL PROTECTED]:~/Documents/haskell/move time ./Sieve2 1000
664579

real0m0.419s
user0m0.420s
sys 0m0.000s

And that's even faster than any sieve in C that I managed to write.
(Okay, I don't know how to use bitsets in C, that might beat Haskell)
  Another thing: you can stop sieving when p*p  size, another speedup

 Saves a few hundred milliseconds.

Which is pretty much for a programme running below 1 second.

  Fifth thing: better use an STUArray, don't drag IO in if it's not
  necessary.

 I don't understand the ST monad.

Nor do I. Just use it (and be prepared to use scoped type variables).

  The obvious downside of course is that you must know how big
  to make the array at the start of your program, and you must compute the
  entire array before you can use any of it. Both limitations not precent
  in the lazy recursive list implementation...
 
  The size of the array can easily be estimated by the prime number
  theorem,

 Probably. But it doesn't compare to the elegance and simplicity of being
 able to generate an arbitrary number of primes on an as-needed basis.

True. But when speed matters, you have to bite the bullet.
And if you don't know up to which bound you need the primes, but know it's a 
not too low bound, you can sieve the first part in an array and then go on 
using a list, that's what I do.

  Keep enjoying Project Euler,
  Daniel

 Well, I don't know about enjoy - generally I just find it quite
 frustrating. It's kind of addictive trying to increase your score
 though... (Anybody here reached 100% yet?)

int-e (Bertram Felgenhauer, I believe), well he has not yet solved last 
week's, so currently he's at 99%, and there were a couple of others but they 
didn't persevere.

 I find that a lot of the problems don't seem to involve much math.
 (E.g., here is some data, process it into a list of integers, find the
 highest one. Where is the deep mathematics in that?) A few of them do,
 but a lot are simply to do with prime numbers, and as far as I'm away,
 it is impossible to know anything about prime numbers. In other words,
 you must compute them all the hard way.

Some of the problems require little math, but more programming techniques. 
Others require a good grip of maths to find an efficient algorithm. Really 
deep maths can't be required, the site is not meant for professional 
mathematicians but for laypeople with mathematical interest.
The aim is to make brute force unattractive (brute forcing times from 5 hours 
upwards), while a mathematical insight will often give the solution in a few 
nanoseconds.
Many problems require both, some maths to find a good algorithm, and some 
programming techniques to make it run really fast.

Cheers,
Daniel
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-29 Thread Stefan O'Rear
On Thu, Nov 29, 2007 at 09:10:16PM +, Andrew Coppin wrote:
 Sebastian Sylvan wrote:
 On Nov 29, 2007 6:43 PM, Andrew Coppin [EMAIL PROTECTED] 
 wrote:
   
 I don't understand the ST monad.
 


 There's not a whole lot to understand if you just want to use it
 (though it's all very cool from a theoretical standpoint too).

 From what I can tell, it's not definable without using strange language 
 extensions. (I don't really like using things where it's unclear why it 
 works.)

You can't implement IO in H98 either.  You just have to accept it as a
given.

(As far as ST goes, runST is unsafePerformIO renamed.  The only tricky
bit is proving safety.)

Stefan


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-29 Thread Jan-Willem Maessen


On Nov 29, 2007, at 6:19 PM, Stefan O'Rear wrote:


On Thu, Nov 29, 2007 at 09:10:16PM +, Andrew Coppin wrote:

Sebastian Sylvan wrote:

On Nov 29, 2007 6:43 PM, Andrew Coppin [EMAIL PROTECTED]
wrote:


I don't understand the ST monad.


...[and ST uses language extensions Andrew doesn't understand.]

(As far as ST goes, runST is unsafePerformIO renamed.  The only tricky
bit is proving safety.)


To put it another way, runST is unsafePerformIO where somebody has  
already done the safety proof for you (so you know it's 100% safe).   
The strange extensions are simply a device to make the safety proof  
work.  Indeed, if you drop the extensions it can all be made to work  
(just say runST :: ST () a - a) but you lose the safety proof and  
it's equivalent to unsafePerformIO.


-Jan

[The trick used in runST is one of my all-time favorite bits of type  
theory, and is what convinced me we wanted second-order types back  
before the first Haskell workshop.]


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-28 Thread Kalman Noel
Sebastian Sylvan:
 primes :: [Integer]
 primes = 2 : filter (null . primeFactors) [3,5..]
 
 primeFactors :: Integer- [Integer]
 primeFactors n = factor n primes
 where
 factor m (p:ps) | p*p  m= []
 | m `mod` p == 0 = p : factor (m `div` p) (p:ps)
 | otherwise  = factor m ps

Your definition gives a strange meaning to primeFactors. I'd want that for all
n,  product (primeFactors n) == n.  I think this law holds for the code posted
by Olivier. Of course I'd beautify his definition slightly by writing

primes = 2 : filter isPrime [3,5..]

isPrime = null . drop 1 . primeFactors

primeFactors n | n = 2 = factor primes n

factor pr@(p:ps) n 
| p*p  n   = [n]
| r == 0= p : factor pr q
| otherwise = factor ps n
where (q,r) = quotRem n p

Kalman

--
Get a free email account with anti spam protection.
http://www.bluebottle.com/tag/2

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-28 Thread Sebastian Sylvan
On Nov 28, 2007 12:12 PM, Kalman Noel [EMAIL PROTECTED] wrote:
 Sebastian Sylvan:
  primes :: [Integer]
  primes = 2 : filter (null . primeFactors) [3,5..]
 
  primeFactors :: Integer- [Integer]
  primeFactors n = factor n primes
  where
  factor m (p:ps) | p*p  m= []
  | m `mod` p == 0 = p : factor (m `div` p) (p:ps)
  | otherwise  = factor m ps

 Your definition gives a strange meaning to primeFactors. I'd want that for all
 n,  product (primeFactors n) == n.  I think this law holds for the code posted
 by Olivier.

Yes you're right. That is property should clearly hold.

-- 
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-28 Thread Andrew Coppin

Sebastian Sylvan wrote:

On Nov 28, 2007 12:12 PM, Kalman Noel [EMAIL PROTECTED] wrote:
  

Sebastian Sylvan:


primes :: [Integer]
primes = 2 : filter (null . primeFactors) [3,5..]

primeFactors :: Integer- [Integer]
primeFactors n = factor n primes
where
factor m (p:ps) | p*p  m= []
| m `mod` p == 0 = p : factor (m `div` p) (p:ps)
| otherwise  = factor m ps
  

Your definition gives a strange meaning to primeFactors. I'd want that for all
n,  product (primeFactors n) == n.  I think this law holds for the code posted
by Olivier.



Yes you're right. That is property should clearly hold.
  


There are problems for which it's important to know how many times a 
given prime factor occurs. And there are other problems where it is 
merely necessary to know which primes are factors. I would say it's 
useful to have *both* functions available...




By the way, I'm now using a new and much uglier prime seive:

size = 100 :: Word64

calc_primes :: IO [Word64]
calc_primes = do
   grid - newArray (2,size) True
   seive 2 grid
 where
   seive :: Word64 - IOUArray Word64 Bool - IO [Word64]
   seive p g = do
 mapM_ (\n - writeArray g n False) [p, 2*p .. size]
 mp' - next (p+1) g
 case mp' of
   Nothing - return [p]
   Just p' - do
 ps - seive p' g
 return (p:ps)

   next :: Word64 - IOUArray Word64 Bool - IO (Maybe Word64)
   next p g = do
 if p == size
   then return Nothing
   else do
 t - readArray g p
 if t
   then return (Just p)
   else next (p+1) g

Seems moderately fast. At least, I can find the 10,001st prime in under 
1 second... The obvious downside of course is that you must know how big 
to make the array at the start of your program, and you must compute the 
entire array before you can use any of it. Both limitations not precent 
in the lazy recursive list implementation...


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-28 Thread Andrew Coppin

Michaeljohn Clement wrote:

Andrew Coppin wrote:
  

First, somebody else wrote this in C:

int n = 2 , m , primesFound = 0;

for( n=0;n  MAX_NUMBERS;n++ )
if( prime[n] )
{
primesFound++;

if( primesFound == 10001 )
  cout  n   is the 10001st prime.  endl;



Um, I can't *believe* nobody else pointed this 
out, but that isn't C, it's C++.
  


Really? How can you tell?

Well anyway, the guy who wrote it said it's C. I suppose it was a simple 
oversight...


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-28 Thread Daniel Fischer
Am Mittwoch, 28. November 2007 22:31 schrieb Andrew Coppin:
 There are problems for which it's important to know how many times a
 given prime factor occurs. And there are other problems where it is
 merely necessary to know which primes are factors. I would say it's
 useful to have *both* functions available...

Yup



 By the way, I'm now using a new and much uglier prime seive:

 size = 100 :: Word64

 calc_primes :: IO [Word64]
 calc_primes = do
 grid - newArray (2,size) True
 seive 2 grid
   where
 seive :: Word64 - IOUArray Word64 Bool - IO [Word64]
 seive p g = do
   mapM_ (\n - writeArray g n False) [p, 2*p .. size]
   mp' - next (p+1) g
   case mp' of
 Nothing - return [p]
 Just p' - do
   ps - seive p' g
   return (p:ps)

 next :: Word64 - IOUArray Word64 Bool - IO (Maybe Word64)
 next p g = do
   if p == size
 then return Nothing
 else do
   t - readArray g p
   if t
 then return (Just p)
 else next (p+1) g

 Seems moderately fast. At least, I can find the 10,001st prime in under
 1 second... 

One thing: since You check the array bounds, the system needn't check them 
again, use unsafeWrite and unsafeRead. And use Int for the index, that would 
be MUCH faster.
Another thing: you can stop sieving when p*p  size, another speedup
Third thing: In my experience mapM_ is relatively slow, hand coded loops are 
faster (though much uglier), but YMMV
Fourth thing: sieve only odd numbers
Fifth thing: better use an STUArray, don't drag IO in if it's not necessary.

 The obvious downside of course is that you must know how big
 to make the array at the start of your program, and you must compute the
 entire array before you can use any of it. Both limitations not precent
 in the lazy recursive list implementation...

The size of the array can easily be estimated by the prime number theorem,
pi(x) ~ x/log x, where pi(x) is the number of primes not exceeding x, so for 
10,001 primes, find x with x/log x about 10,000, add a bit for safety, a 
bound of 120,000 will do.
If you want the n-th prime, you don't want to use the smaller primes anyway, 
if you need all primes, chances are that prime generation will take 
negligible time in comparison to your main algo anyway.

Keep enjoying Project Euler,
Daniel
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-28 Thread Sebastian Sylvan
On Nov 28, 2007 9:28 PM, Andrew Coppin [EMAIL PROTECTED] wrote:
 Michaeljohn Clement wrote:
  Andrew Coppin wrote:
 
  First, somebody else wrote this in C:
 
  int n = 2 , m , primesFound = 0;
 
  for( n=0;n  MAX_NUMBERS;n++ )
  if( prime[n] )
  {
  primesFound++;
 
  if( primesFound == 10001 )
cout  n   is the 10001st prime.  endl;
 
 
  Um, I can't *believe* nobody else pointed this
  out, but that isn't C, it's C++.
 

 Really? How can you tell?

cout, endl etc.



-- 
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-28 Thread Brandon S. Allbery KF8NH


On Nov 28, 2007, at 16:28 , Andrew Coppin wrote:


Michaeljohn Clement wrote:

Andrew Coppin wrote:


First, somebody else wrote this in C:

int n = 2 , m , primesFound = 0;

for( n=0;n  MAX_NUMBERS;n++ )
if( prime[n] )
{
primesFound++;

if( primesFound == 10001 )
  cout  n   is the 10001st prime.  endl;



Um, I can't *believe* nobody else pointed this out, but that isn't  
C, it's C++.




Really? How can you tell?



Strictly speaking, the I/O is done with C++ operators and variables.
But the actual algorithm is C, if you replace the cout  ... with  
printf() then the whole thing will be C.


--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED]
system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED]
electrical and computer engineering, carnegie mellon universityKF8NH


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-27 Thread Brent Yorgey
On Nov 27, 2007 2:34 PM, Andrew Coppin [EMAIL PROTECTED] wrote:

 Hi guys.

 Somebody just introduced me to a thing called Project Euler. I gather
 it's well known around here...

 Anyway, I was a little bit concerned about problem #7. The problem is
 basically figure out what the 10,001st prime number is. Consider the
 following two programs for solving this:

 First, somebody else wrote this in C:

 int n = 2 , m , primesFound = 0;

 for( n=0;n  MAX_NUMBERS;n++ )
 if( prime[n] )
 {
  primesFound++;

  if( primesFound == 10001 )
   cout  n   is the 10001st prime.  endl;

  for(m=2*n;mMAX_NUMBERS;m+=n)
   prime[m]=false;
 }

 Second, my implementation in Haskell:

 primes :: [Integer]
 primes = seive [2..]
where
  seive (p:xs) = p : seive (filter (\x - x `mod` p  0) xs)

 main = print (primes !! 1)

 Well, we know who's winning the beauty contest. But here's the worrying
 thing:

  C program: 0.016 seconds
  Haskell program: 41 seconds

 Eeeps! o_O That's Not Good(tm).

 Replacing primes :: [Integer] with primes :: [Word32] speeds the
 Haskell version up to 18 seconds, which is much faster but still a joke
 compared to C.

 Running the Haskell code on a 2.2 GHz AMD Athlon64 X2 instead of a 1.5
 GHz AMD Athlon drops the execution time down to 3 seconds. (So clearly
 the box I was using in my lunch break at work is *seriously* slow...
 presumably the PC running the C code isn't.)

 So, now I have a Haskell version that's only several hundred times
 slower. Neither program is especially optimised, yet the C version is
 drastically faster. This makes me sad. :-(

 By the way... I tried using Data.List.Stream instead. This made the
 program about 40% slower. I also tried both -fasm and -fvia-c. The
 difference was statistically insignificant. (Hey guys, nice work on the
 native codegen!) The difference in compilation speed was fairly drastic
 though, needless to say. (The machine is kinda low on RAM, so trying to
 call GCC makes it thrash like hell. So does linking, actually...)


 Also, I'm stuck with problem #10. (Find the sum of all primes less than
 1 million.) I've let the program run for well over 15 minutes, and still
 no answer is forthcomming. It's implemented using the same primes
 function as above, with a simple filter and sum. (The type has to be
 changed to [Word64] to avoid overflowing the result though.) The other
 guy claims to have a C solution that takes 12 ms. There's a hell of a
 difference between 12 ms and over half an hour...(!) Clearly something
 is horribly wrong here. Uh... help?


The algorithm you use to compute primes is actually quite inefficient, and
in particular, it is NOT the same algorithm that the C program is using,
despite first appearances!  The call to 'filter' in the sieve function works
by checking *every* number for divisibility by p, and only keeping those
that aren't divisible by p; whereas the C program simply marks multiples of
p as non-prime, skipping over those which are not multiples.  So the Haskell
version basically searches for primes by doing trial division on every
single integer, whereas the C version is actually using a sieve.

For more information you might want to read this paper, which includes a
much better primes implementation:

www.cs.hmc.edu/~*oneill*/papers/*Sieve*-JFP.pdf

In fact, this very same topic was discussed on the list a while back, you
can read it here:

http://thread.gmane.org/gmane.comp.lang.haskell.cafe/19699/focus=19798

-Brent
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-27 Thread Don Stewart
andrewcoppin:
 Hi guys.
 
 Somebody just introduced me to a thing called Project Euler. I gather 
 it's well known around here...
 
 Anyway, I was a little bit concerned about problem #7. The problem is 
 basically figure out what the 10,001st prime number is. Consider the 
 following two programs for solving this:
 
 First, somebody else wrote this in C:
 
 int n = 2 , m , primesFound = 0;
 
 for( n=0;n  MAX_NUMBERS;n++ )
 if( prime[n] )
 {
 primesFound++;
 
 if( primesFound == 10001 )
   cout  n   is the 10001st prime.  endl;
 
 for(m=2*n;mMAX_NUMBERS;m+=n)
   prime[m]=false;
 }
 
 Second, my implementation in Haskell:
 
 primes :: [Integer]
 primes = seive [2..]
where
  seive (p:xs) = p : seive (filter (\x - x `mod` p  0) xs)
 
 main = print (primes !! 1)


This is an FAQ. 
Unless you use the same algorithm and data types in benchmarks, you're 
not really benchmarking anything. And expecting one of the worst
possible algorithms to be good is hoping for a little too much :)

When you do use similar data types and algorithms, you get similar
performance:

http://shootout.alioth.debian.org/gp4/benchmark.php?test=nsievebitslang=all

I'll reiterate: use the same data structures and algorithms, if you want
the same performance.

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-27 Thread Henning Thielemann

On Tue, 27 Nov 2007, Andrew Coppin wrote:

 So, now I have a Haskell version that's only several hundred times
 slower. Neither program is especially optimised, yet the C version is
 drastically faster. This makes me sad. :-(

I think the C version is so much faster because it does not need
allocations in the inner loop. If you rewrite your program using a mutable
array your program should become faster - and more ugly. However it might
be fast enough if you create a new array for each run over the array with
a new prime. Then you do not need mutable arrays.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-27 Thread Olivier Boudry
On 11/27/07, Andrew Coppin [EMAIL PROTECTED] wrote:

 Hi guys.

 Somebody just introduced me to a thing called Project Euler. I gather
 it's well known around here...

 Anyway, I was a little bit concerned about problem #7. The problem is
 basically figure out what the 10,001st prime number is. Consider the
 following two programs for solving this:

 First, somebody else wrote this in C:

 int n = 2 , m , primesFound = 0;

 for( n=0;n  MAX_NUMBERS;n++ )
 if( prime[n] )
 {
 primesFound++;

 if( primesFound == 10001 )
cout  n   is the 10001st prime.  endl;

 for(m=2*n;mMAX_NUMBERS;m+=n)
prime[m]=false;
 }

 Second, my implementation in Haskell:

 primes :: [Integer]
 primes = seive [2..]
 where
   seive (p:xs) = p : seive (filter (\x - x `mod` p  0) xs)

 main = print (primes !! 1)

 Well, we know who's winning the beauty contest. But here's the worrying
 thing:

   C program: 0.016 seconds
   Haskell program: 41 seconds

 Eeeps! o_O That's Not Good(tm).

 Replacing primes :: [Integer] with primes :: [Word32] speeds the
 Haskell version up to 18 seconds, which is much faster but still a joke
 compared to C.

 Running the Haskell code on a 2.2 GHz AMD Athlon64 X2 instead of a 1.5
 GHz AMD Athlon drops the execution time down to 3 seconds. (So clearly
 the box I was using in my lunch break at work is *seriously* slow...
 presumably the PC running the C code isn't.)

 So, now I have a Haskell version that's only several hundred times
 slower. Neither program is especially optimised, yet the C version is
 drastically faster. This makes me sad. :-(

 By the way... I tried using Data.List.Stream instead. This made the
 program about 40% slower. I also tried both -fasm and -fvia-c. The
 difference was statistically insignificant. (Hey guys, nice work on the
 native codegen!) The difference in compilation speed was fairly drastic
 though, needless to say. (The machine is kinda low on RAM, so trying to
 call GCC makes it thrash like hell. So does linking, actually...)


 Also, I'm stuck with problem #10. (Find the sum of all primes less than
 1 million.) I've let the program run for well over 15 minutes, and still
 no answer is forthcomming. It's implemented using the same primes
 function as above, with a simple filter and sum. (The type has to be
 changed to [Word64] to avoid overflowing the result though.) The other
 guy claims to have a C solution that takes 12 ms. There's a hell of a
 difference between 12 ms and over half an hour...(!) Clearly something
 is horribly wrong here. Uh... help?

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe


Hi Andrew,

I don't remember who I stole this prime computation from, but it is very
fast (10001's prime in 0.06 sec with Int and 0.2 with Integer on my machine)
and not overly complex:

primes :: [Integer]
primes = 2 : filter (l1 . primeFactors) [3,5..]
where
l1 (_:[]) = True
l1  _ = False

primeFactors :: Integer - [Integer]
primeFactors n = factor n primes
where
factor _ [] = []
factor m (p:ps) | p*p  m= [m]
| m `mod` p == 0 = p : factor (m `div` p) (p:ps)
| otherwise  = factor m ps

I used it a lot while playing with Euler Project. Many of the problems
require prime calculation.

Cheers,

Olivier.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-27 Thread Andrew Coppin

Andrew Coppin wrote:
Also, I'm stuck with problem #10. (Find the sum of all primes less 
than 1 million.) I've let the program run for well over 15 minutes, 
and still no answer is forthcomming. It's implemented using the same 
primes function as above, with a simple filter and sum. (The type has 
to be changed to [Word64] to avoid overflowing the result though.) The 
other guy claims to have a C solution that takes 12 ms. There's a hell 
of a difference between 12 ms and over half an hour...(!) Clearly 
something is horribly wrong here. Uh... help?


I just let it run to completion. Took 25 minutes and 15 seconds to find 
the (correct) answer. I would have preferred it to take 12 ms...


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-27 Thread Sebastian Sylvan
On Nov 27, 2007 8:54 PM, Olivier Boudry [EMAIL PROTECTED] wrote:


 Hi Andrew,

  I don't remember who I stole this prime computation from, but it is very
 fast (10001's prime in 0.06 sec with Int and 0.2 with Integer on my machine)
 and not overly complex:

  primes :: [Integer]
  primes = 2 : filter (l1 . primeFactors) [3,5..]
  where
  l1 (_:[]) = True
  l1  _ = False

  primeFactors :: Integer - [Integer]
  primeFactors n = factor n primes
  where
  factor _ [] = []
  factor m (p:ps) | p*p  m= [m]
  | m `mod` p == 0 = p : factor (m `div` p) (p:ps)
  | otherwise  = factor m ps

 I used it a lot while playing with Euler Project. Many of the problems
 require prime calculation.



That is indeed a nice and clear version that's pretty fast. It's
basically the same as the C version except backwards (i.e. examine a
number and work backwards through its divisors, rather than filling in
a map of all multiples of a known prime).
Let me suggest the following slight modification (primeFactors in your
version doesn't actually return prime factors - it returns prime
factors *or* a list of just the number itself),

primes :: [Integer]
primes = 2 : filter (null . primeFactors) [3,5..]

primeFactors :: Integer- [Integer]
primeFactors n = factor n primes
where
factor m (p:ps) | p*p  m= []
| m `mod` p == 0 = p : factor (m `div` p) (p:ps)
| otherwise  = factor m ps


This is roughly 35% faster on my machine with GHC 6.7.20070730 too,
but the point wasn't to make it faster, but clearer.
-- 
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-27 Thread Sebastian Sylvan
On Nov 27, 2007 8:44 PM, Andrew Coppin [EMAIL PROTECTED] wrote:
 Andrew Coppin wrote:
  Also, I'm stuck with problem #10. (Find the sum of all primes less
  than 1 million.) I've let the program run for well over 15 minutes,
  and still no answer is forthcomming. It's implemented using the same
  primes function as above, with a simple filter and sum. (The type has
  to be changed to [Word64] to avoid overflowing the result though.) The
  other guy claims to have a C solution that takes 12 ms. There's a hell
  of a difference between 12 ms and over half an hour...(!) Clearly
  something is horribly wrong here. Uh... help?

 I just let it run to completion. Took 25 minutes and 15 seconds to find
 the (correct) answer. I would have preferred it to take 12 ms...



FWIW the following code took 0.77s on my machine:

primes :: [Integer]
primes = 2 : filter (null . primeFactors) [3,5..]

primeFactors :: Integer- [Integer]
primeFactors n = factor n primes
where
factor m (p:ps) | p*p  m= []
| m `mod` p == 0 = p : factor (m `div` p) (p:ps)
| otherwise  = factor m ps


main = print ( sum ( takeWhile ( 100) primes ) )


-- 
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-27 Thread Andrew Coppin
On the other hand, I must relay to you how much fun I had with certain 
other problems.


For example, problem #12. I started with this:

 triangles = scanl1 (+) [1..]

 divisors n = length $ filter (\x - n `mod` x == 0) [1..n]

 answer = head $ dropWhile (\n - divisors n  500) triangles

Sadly, this is *absurdly* slow. I gave up after about 5 minutes of 
waiting. It had only scanned up to T[1,200]. Then I tried the following:


triangles = scanl1 (+) [1..]

primes :: [Word32]
primes = seive [2..]
 where
   seive (p:xs) = p : seive (filter (\x - x `mod` p  0) xs)

factors = work primes
 where
   work (p:ps) n =
 if p  n
   then []
   else
 if n `mod` p == 0
then p : work (p:ps) (n `div` p)
else work ps n

count_factors n =
 let
   fs = factors n
   fss = group fs
 in product $ map ((1+) . length) fss

answer = head $ dropWhile (\n - count_factors n  500) triangles

By looking only at *prime* divisors and then figuring out how many 
divisors there are in total (you don't even have to *compute* what they 
are, just *count* them!), I was able to get the correct solution in 
UNDER ONE SECOND! o_O :-D :^]


Now how about that for fast, eh?

(Actually, having solved it I discovered there's an even better way - 
apparently there's a relationship between the number of divisors of 
consecutive triangular numbers. I'm too noobish to understand the number 
theory here...)



Similarly, problem 24 neatly illustrates everything that is sweet and 
pure about Haskell:


 choose [x] = [(x,[])]
 choose (x:xs) = (x,xs) : map (\(y,ys) - (y,x:ys)) (choose xs)

 permutations [x] = [[x]]
 permutations xs = do
   (x1,xs1) - choose xs
   xs2 - permutations xs1
   return (x1 : xs2)

 answer = (permutations 0123456789) !! 99

This finds the answer in less than 3 seconds. And it is beautiful to 
behold. Yes, there is apparently some more sophisticated algorithm than 
actually enumerating the permutations. But I love the way I threw code 
together in the most intuitive way that fell into my head, and got a 
fast, performant answer. Perfection! :-)


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-27 Thread Yitzchak Gale
Andrew Coppin wrote:
 On the other hand, I must relay to you how much fun I had with certain
 other problems.

You may want to look at:

http://haskell.org/haskellwiki/Euler_problems

and make some contributions. But be very careful
what you peek at, so don't spoil your own fun.

Regards,
Yitz
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-27 Thread Olivier Boudry
On 11/27/07, Sebastian Sylvan [EMAIL PROTECTED] wrote:

 That is indeed a nice and clear version that's pretty fast. It's
 basically the same as the C version except backwards (i.e. examine a
 number and work backwards through its divisors, rather than filling in
 a map of all multiples of a known prime).
 Let me suggest the following slight modification (primeFactors in your
 version doesn't actually return prime factors - it returns prime
 factors *or* a list of just the number itself),

 primes :: [Integer]
 primes = 2 : filter (null . primeFactors) [3,5..]

 primeFactors :: Integer- [Integer]
 primeFactors n = factor n primes
 where
 factor m (p:ps) | p*p  m= []
 | m `mod` p == 0 = p : factor (m `div` p) (p:ps)
 | otherwise  = factor m ps


 This is roughly 35% faster on my machine with GHC 6.7.20070730 too,
 but the point wasn't to make it faster, but clearer.
 --
 Sebastian Sylvan
 +44(0)7857-300802
 UIN: 44640862


Great remark, it's even simpler like this.

By the way I just found the article I stole this algorithm from:
http://www.haskell.org/haskellwiki/99_questions/31_to_41 last one in problem
35.

Cheers,

Olivier.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-27 Thread Andrew Coppin

Brent Yorgey wrote:
The algorithm you use to compute primes is actually quite inefficient, 
and in particular, it is NOT the same algorithm that the C program is 
using, despite first appearances!  The call to 'filter' in the sieve 
function works by checking *every* number for divisibility by p, and 
only keeping those that aren't divisible by p; whereas the C program 
simply marks multiples of p as non-prime, skipping over those which 
are not multiples.  So the Haskell version basically searches for 
primes by doing trial division on every single integer, whereas the C 
version is actually using a sieve.


I had to reread that several times to figure out what you're actually 
saying.


So the Haskell version tests all N elements to see if P divides them, 
whereas the C version loops over an array (roughly) N / P times flagging 
the composite numbers, which is why it's so much damn faster. (Since as 
P grows, N / P plummets.)


OK, cool. So... now how do I implement this without using mutable 
arrays? :-}


PS. The *original* prime number thing I had was much slower. Used an 
is_prime function to do trial division by *every* number below N. (!!) 
The version I showed as much faster than that, but still quite slow, as 
you say...


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A tale of Project Euler

2007-11-27 Thread Andrew Coppin

Don Stewart wrote:
This is an FAQ. 
Unless you use the same algorithm and data types in benchmarks, you're 
not really benchmarking anything. And expecting one of the worst

possible algorithms to be good is hoping for a little too much :)
  


Well, if I was comparing my Haskell against some expert-optimised C 
library then yeah. What puzzled me is that both programs are unoptimised 
and appear to be using *the same* algorithm, and yet the performance is 
incomparable.


Now that Brent has spoken, I understand the error of my ways. ;-)


When you do use similar data types and algorithms, you get similar
performance.


One would hope so, yes... :-)

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe