Re: [Haskell-cafe] Re: Code and Perf. Data for Prime Finders(was:Genuine Eratosthenes sieve)

Melissa O'Neill Sun, 25 Feb 2007 22:15:54 -0800

Claus, you're absolutely right that my paper could be improved in anumber of ways, and that there is actually a surprising amount ofmeat on these bones for further analysis. We'll see what happens.If it actually gets accepted for JFP, it'll undoubtedly finallyappear as a revised version with less controversial language. Inhindsight, using loaded words was a huge mistake, because of the wayit distracts from my main points.

But one thing I've learned is that it's better to do somethingimperfect and get it out there for people to read than to try to makeit perfect and end up with a perpetual work in progress. I had abusy summer last summer, and the paper was as good as I could make itin the very short amount of time I had. It may have annoyed you insome ways, but hopefully it informed you in others.


Claus wrote:

something in me balks at the idea that the folkore version of thesieve is not "the" sieve *algorithm*, in spite of your detailedcross-off footprint and performance curve arguments.


I don't think you're alone in that viewpoint.

But it's clear to me that while some people are aware of thedifferences in crossing-off behavior, the performance costs thatentails, and the way the ``folklore sieve'' becomes a dual of thevery-naive-primes algorithm (trial division by all the primes lessthan n) -- and despite those differences, want to say it's all"okay"; there also are plenty of people who are unaware of thesesubtleties and believe that they have an honest-to-goodnessfunctional implementation of one of the fastest prime finding methodsout there, are hugely disappointed at the performance they get, andascribe it not to poor algorithmic choices, but to the language theyare working with.

As I allude to in my paper, I've talked to various people locallywho've seen the ``folklore sieve'', and it's fascinating to watch thelight go on for them, and hear exclamations like ``Oh! So *that's*why it's so slow!''. (FWIW, I think it was those reactions, whichincluded for some a sense of having been duped, that lead me to feelokay with the rather strong terms I used in my paper.)


Claus also wrote:

but we are in the grey area between "permute until sorted" and"sort like this", and thinking of the modulus of a number as astatic property (that might be derived incrementally from that ofits predecessor) rather than an explicit check

I think modulus is a bit of a red herring -- its a symptom, not thedisease. I would argue that if "the agent that crosses off 17s" hasto look at 19 and say "that's okay, let that one pass through",whether it passes it through based on modulus, or because it is lessthan 289, it's still an examination of 19 by the cross-off-17 agentwhere that agent has the power to allow 19 through or not. Seenthat way, I have a very hard time accepting the algorithm as afaithful implementation of the Sieve of Eratosthenes. Similarly, Ihave a hard time accepting the faithfulness an algorithm where 70doesn't get a visit from the cross-off agents for 5 and 7.

But I agree though, that it may be possible to perform progressivetransformations from a ``folklore sieve'', which doesn't cross offaccording to the proper conventions to something that does.Somewhere along the way, there's going to be a grey area, and arguingabout definitions in that grey area is likely to be a waste of time.At some point in the middle, you'll have something that can be viewedfrom both perspectives. But the existence of such a progression ofalgorithms between algorithm A and algorithm B doesn't mean that Aand B are fundamentally the same.

as it happens, even if we run folklore3 over [2..], 70 will becrossed offexactly once, by the sieve for 2. the sieves for 5 and 7 run overthe gap
left behind, as they do in folklore and folklore2 (one could move that
gap jumping from sieve3 to insert, though..). the sieves for 5 and7 "know" about 70 the same way that (`mod`5) and (`mod`7) knowabout 70, but that knowledge isn't called upon for sieving, only tofind greater numbers with modulus 0.

Perhaps we're saying the same thing and misunderstanding each other,but here's what I see on an instrumented version of your code thatshows the state of affairs at each loop iteration:

    ...
   (Prime 67,[(2,68),(3,69),(5,70),(7,70),(11,121), ...]),
   (Loops 68,[(2,68),(3,69),(5,70),(7,70),(11,121), ...]),
   (Loops 69,[(3,69),(2,70),(5,70),(7,70),(11,121), ...]),
   (Loops 70,[(2,70),(5,70),(7,70),(3,72),(11,121), ...]),
   (Loops 71,[(5,70),(7,70),(2,72),(3,72),(11,121), ...]),
   (Loops 71,[(7,70),(2,72),(3,72),(5,75),(11,121), ...]),
   (Prime 71,[(2,72),(3,72),(5,75),(7,77),(11,121), ...]),
   (Loops 72,[(2,72),(3,72),(5,75),(7,77),(11,121), ...]),
   (Loops 73,[(3,72),(2,74),(5,75),(7,77),(11,121), ...]),
   (Prime 73,[(2,74),(3,75),(5,75),(7,77),(11,121), ...]),
   (Loops 74,[(2,74),(3,75),(5,75),(7,77),(11,121), ...]),
   (Loops 75,[(3,75),(5,75),(2,76),(7,77),(11,121), ...]),
   (Loops 76,[(5,75),(2,76),(7,77),(3,78),(11,121), ...]),
   (Loops 76,[(2,76),(7,77),(3,78),(5,80),(11,121), ...]),
   (Loops 77,[(7,77),(2,78),(3,78),(5,80),(11,121), ...]),
   (Loops 78,[(2,78),(3,78),(5,80),(7,84),(11,121), ...]),
   (Loops 79,[(3,78),(2,80),(5,80),(7,84),(11,121), ...]),
   (Prime 79,[(2,80),(5,80),(3,81),(7,84),(11,121), ...]),
    ...

You may say that 70 is crossed off once, but the fact that the othertwo iterators for 70 move on when you're ``looking at'' 71 seems likesplitting hairs. There is work that exactly corresponds to the threefactors of 70. In the earlier versions, there is no such work -- 70gets filtered out by the 2s and never seen again.

In any case, it looks to me like a genuine sieve (and like my cousinof my technique) in a way that the earlier iterations do not.

Perhaps here's another way of putting it. A real Sieve ofEratosthenes ought to be easy to augment to output ("*for free*") notjust a list of primes, but a list of primes and composites where weshow how to factor each of the composites -- for each composite c, weget a list of all the unique factors of c <= sqrt(c). (ImagineEratosthenes not merely crossing out each multiple of 17 butannotating it so that we know it was a multiple of 17).

Although it would take a little bit of post-processing, it's easy tosee how we could extract that information from the output above, andhopefully fairly easy to see that we could modify your algorithm (ormine) to output the information directly. The ``folklore sieves''cannot do this without a heavy performance penalty.

From an operational behavior standpoint, the ``folklore sieve'' justdoesn't do the same thing. Every expectation people have about thebehavior of the Sieve of Eratosthenes is confounded by the ``folkloresieve''. Although, if all you're looking for is a short and elegant-looking *specification* that brings to mind the Sieve ofEratosthenes, the ``folklore sieve'' will suit just fine. To methough, the Sieve of Eratosthenes doesn't exist as a specificationfor primality -- their are easier definitions -- it serves as aclever way to find primes (and factor composites) efficiently. Thefundamental operational details matter.


    Melissa.

P.S.  Here's my instrumented version of Claus's code:

primes = folklore3 [] [2..]

data Status a = Prime a
              | Loops a
            deriving (Eq, Ord, Read, Show)

folklore3 pns xs = (mx,pns) : more mx
  where (mx,pns',xs') = sieve3 pns xs
        more (Prime x) = folklore3 (insert (x,x*x) pns') xs'
        more (Loops x) = folklore3 pns' xs'

sieve3 ((p,n):pns) (x:xs)
                 | x<n       = (Prime x, ((p,n):pns),xs)

| otherwise = (Loops x, insert (p,(n+p)) pns, [x|x>n]++xs)

sieve3 []          (x:xs)    = (Prime x, [],xs)

insert (p,n) []                        = [(p,n)]
insert (p,n) ((p',n'):pns) | n<=n'     = (p,n):(p',n'):pns
                           | otherwise = (p',n'):insert (p,n) pns

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Re: Code and Perf. Data for Prime Finders(was:Genuine Eratosthenes sieve)

Reply via email to