Brent Yorgey wrote:
Kurt Hutchinson wrote:

As part of a solution I'm working on for Project Euler problem 119, I
wanted to create an ordered list of all powers of all positive
integers (starting with squares).

The merging can be done much more simply and efficiently (this is code I
wrote when computing squarefree numbers in a blog
post <http://byorgey.wordpress.com/2007/09/01/squarefree-numbers-in-haskell/>
a few months ago):

-- merge two nondecreasing lists.
( # ) :: (Ord a) => [a] -> [a] -> [a]
[] # ys = ys
xs # [] = xs
xs@(x:xt) # ys@(y:yt) | x < y = x : (xt # ys)
                      | x > y = y : (xs # yt)
                      | otherwise = x : (xt # yt)

-- merge an infinite list of lists, assuming that each list
-- is nondecreasing and the lists occur in order of their first
-- element.
mergeAll :: (Ord a) => [[a]] -> [a]
mergeAll ([] : zs) = mergeAll zs
mergeAll (xxs@(x:xs) : yys@(y:ys) : zs)
    | x < y = x : mergeAll (xs : yys : zs)
    | otherwise = mergeAll ((xxs # yys) : zs)


Then you can simply define

powers = 1 : mergeAll powertable
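For reference, here is a self-contained version of the code above. The definition of  powertable  is my assumption, since it is not shown in this thread; it uses one row per base, matching Kurt's layout.

```haskell
-- merge two nondecreasing lists, dropping duplicates.
( # ) :: (Ord a) => [a] -> [a] -> [a]
[] # ys = ys
xs # [] = xs
xs@(x:xt) # ys@(y:yt)
  | x < y     = x : (xt # ys)
  | x > y     = y : (xs # yt)
  | otherwise = x : (xt # yt)

-- merge an infinite list of nondecreasing lists that occur
-- in order of their first element.
mergeAll :: (Ord a) => [[a]] -> [a]
mergeAll ([] : zs) = mergeAll zs
mergeAll (xxs@(x:xs) : yys@(y:ys) : zs)
  | x < y     = x : mergeAll (xs : yys : zs)
  | otherwise = mergeAll ((xxs # yys) : zs)

-- assumed layout: row n holds the powers of base n (n^2, n^3, ...)
powertable :: [[Integer]]
powertable = [ [n ^ k | k <- [2 ..]] | n <- [2 ..] ]

powers :: [Integer]
powers = 1 : mergeAll powertable

main :: IO ()
main = print (take 16 powers)
-- prints [1,4,8,9,16,25,27,32,36,49,64,81,100,121,125,128]
```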

I wrote some code to sum the first n powers and print the result, and
compiled (using -O2) first with your method, then with mergeAll.  Summing
the first 7000 powers took ~8s and ~0.1s respectively, so that's a pretty
big speedup. Based on seeing how the times scale, I suspect that your code
is O(n^2) or something of that sort, whereas mergeAll is essentially linear,
although without scrutinizing your code more closely I'm not exactly sure
why that would be the case.

In principle, both Kurt's code and even your mergeAll are O(n^2), though the actual cost depends on the distribution of the numbers in the powertable. In both cases, the optimization over a naive implementation is to increase the number of rows under consideration only when the next output element came from the last row considered. This is sound because only the head element of that last row has been examined so far, and all the elements not yet considered must be larger than it.

Kurt's code is slower because it takes the minimum over _all_ the considered rows at every step. This is unnecessary: since only one element has changed, many comparisons can be "cached". In other words, this calls for a heap. Brent's code does not build an explicit heap, but it's still able to cache lots of comparisons.
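To make "this calls for a heap" concrete, here is a sketch (my code, not anything from this thread) that uses Data.Map as a priority queue keyed on each row's current head. It assumes, as above, that the rows are nondecreasing and occur in order of their first element, and it admits a new row only once that row's head is no larger than the current minimum.

```haskell
import qualified Data.Map as M

mergeAllPQ :: Ord a => [[a]] -> [a]
mergeAllPQ = go M.empty
  where
    -- insert a row keyed on its head; empty rows vanish
    push (z : zs) q = M.insertWith (++) z [zs] q
    push []       q = q

    -- admit further rows while their head cannot exceed the queue's minimum
    admit q (row@(y : _) : rs)
      | y <= fst (M.findMin q) = admit (push row q) rs
    admit q ([] : rs) = admit q rs
    admit q rs        = (q, rs)

    go q rs
      | M.null q = case rs of              -- (re)seed the queue
          []          -> []
          ([] : rs')  -> go q rs'
          (row : rs') -> go (push row q) rs'
      | otherwise =
          let (q1, rs1)        = admit q rs
              ((v, tails), q2) = M.deleteFindMin q1
          in v : go (foldr push q2 tails) rs1

main :: IO ()
main = print (mergeAllPQ [[4,8,16,32],[9,27,81],[16,64],[25,125],[36]])
-- prints [4,8,9,16,25,27,32,36,64,81,125]
```

Equal heads share a single key in the map, so duplicates collapse just as with # ; each step then costs O(log k) for k admitted rows.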

Kurt may want to transpose the  powertable  to

  2^2 3^2 4^2 5^2 ..
  2^3 3^3 4^3 5^3 ..
  2^4 3^4 4^4 ..
  ..

instead of the current

  2^2 2^3 2^4 2^5 ..
  3^2 3^3 3^4 3^5 ..
  4^2 4^3 4^4 ..
  ..

since the first elements of the rows of the former grow far more steeply than those of the latter. This means that only a few rows need to be considered each turn.
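In code, the two layouts might be written as follows (hypothetical definitions, since  powertable  itself is not shown in the thread):

```haskell
-- Kurt's current layout: one base per row (2^2, 2^3, 2^4, ...)
byBase :: [[Integer]]
byBase = [ [n ^ k | k <- [2 ..]] | n <- [2 ..] ]

-- the transposed layout: one exponent per row (2^2, 3^2, 4^2, ...)
byExponent :: [[Integer]]
byExponent = [ [n ^ k | n <- [2 ..]] | k <- [2 ..] ]

main :: IO ()
main = mapM_ (print . map head . take 5) [byBase, byExponent]
```

The heads of byBase grow only quadratically (4, 9, 16, 25, 36, ...), while the heads of byExponent grow exponentially (4, 8, 16, 32, 64, ...).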

However, Brent may not want to transpose the powertable since the steep increase in every single row (as opposed to across rows) is exactly what makes his code fast. During evaluation, his tower of calls to # will compare something like the following numbers:

  2^5   3^4   4^3   5^2

Thanks to the form of the tower, the comparisons of the first elements are "cached". It looks like

  mergeAll $ (2^5:(__ # 3^4:__) # 4^3:__) : (5^2:xs) : __

The winner is 5^2, and mergeAll proceeds by demanding the head of  xs . But this head will first be compared to 2^5 = minimum [2^5,3^4,4^3]; that's what I mean by "cached". Similarly, the winner of the inner merge, 2^5 = minimum [2^5,3^4], is cached, too. In other words, the minimum of every initial segment of the numbers considered is "cached". The crucial point now is that those initial numbers quickly become very large (2^7 jumps exponentially to 2^8 and so on), so the winners are mostly found near the end of the list. With the caching, this is cheap to check. If Brent were to transpose the  powertable , the winners would sit more toward the front of the list and the caching would be useless, most likely rendering his implementation inferior to Kurt's.


Now, back to the heap and O(n*log n). There is no need to use an explicit heap data structure; it's possible to arrange the merges to form an implicit heap, i.e. to use the best form of "caching". Here's an explanation of the technique with mergesort:

  http://www.mail-archive.com/[EMAIL PROTECTED]/msg19980.html

(Hm, does gmane gradually forget old messages?) The only problem is to make this work on an infinite list. Dave Bayer discovered a great way to do this; here's an explanation:

  http://thread.gmane.org/gmane.comp.lang.haskell.cafe/26426/focus=26493
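For the finite case, the implicit-heap arrangement from that mergesort post can be sketched like this (my paraphrase; Dave Bayer's trick for the infinite case is in the linked thread):

```haskell
-- merge two nondecreasing lists, dropping duplicates.
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge xs@(x:xt) ys@(y:yt)
  | x < y     = x : merge xt ys
  | x > y     = y : merge xs yt
  | otherwise = x : merge xt yt

-- merge adjacent pairs, halving the number of lists
mergePairs :: Ord a => [[a]] -> [[a]]
mergePairs (xs : ys : rest) = merge xs ys : mergePairs rest
mergePairs xss              = xss

-- repeated halving gives every element an O(log n) path through
-- the merges: an implicit heap of comparisons.
mergeTree :: Ord a => [[a]] -> [a]
mergeTree []   = []
mergeTree [xs] = xs
mergeTree xss  = mergeTree (mergePairs xss)

main :: IO ()
main = print (mergeTree [[4,8],[9,27],[16,64],[25]])
-- prints [4,8,9,16,25,27,64]
```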


Regards,
apfelmus

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
