Re: [Haskell-cafe] V.I.P.s and the associativity of merge'
Heinrich Apfelmus apfelmus at quantentunnel.de writes: Will Ness wrote: Heinrich Apfelmus writes: Here an example where the VIP merge would give a different result bad = tfold $ (1:10:undefined) : (2:3:5:undefined) : (4:undefined) : error bad The reason to *not* have the lazy patterns in foldTree for primes, as Daniel Fischer discovered back then, is that they give it a space leak. Case in point - http://ideone.com/DLHp2 : [...] tfold ((x:xs):t) = x : xs `merge` tfold (pairs t) where pairs ((x:xs):ys:t) = (x : merge xs ys) : pairs t hfold xs = serve . foldTree mergeP . map vip $ xs hfold' xs = serve . foldTree' mergeP . map vip $ xs foldTree f ~(x:xs) = x `f` foldTree f (pairs xs) where pairs ~(x: ~(y:ys)) = f x y : pairs ys foldTree' f (x:xs) = x `f` foldTree' f (pairs xs) where pairs (x: (y:ys)) = f x y : pairs ys [...] so hfold' too appears to be over-eager to-the-right, although still more productive than tfold. Ah, the lazy patterns in foldTree are a different issue, sorry for my bad choice of example. While I still don't understand the trade-off that turns the lazy patterns into a space leak, there is no harm in allowing foldTree to see the complete spine of the list. What we do want to forbid is looking at the elements of that list too early. This turns out to be too optimistic a demand on data, in general. In other words, the example should read bad = tfold $ (1:10:undefined) : (2:3:5:undefined) : (4:undefined) : repeat (error bad : undefined) i.e. the previously unknown tail is replaced with an infinite list of undefined elements. This example can properly distinguish between the not-so-correct tfold and proper VIP implementations (or other implementations that don't do unnecessary comparisons). will have to think this over, but in the mean time, they *both* turn out to be *completely* and utterly *wrong* :) in a general case (although probably for different reasons). Here's where: *Main mapM_ print $ take 5 $ map (take 10) [concatMap (replicate 3) [n,n+1..]|n-[1..]] [1,1,1,2,2,2,3,3,3,4] [2,2,2,3,3,3,4,4,4,5] [3,3,3,4,4,4,5,5,5,6] [4,4,4,5,5,5,6,6,6,7] [5,5,5,6,6,6,7,7,7,8] *Main take 20 $ hfold [concatMap (replicate 3) [n,n+1..]|n-[1..]] [1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7] *Main take 20 $ tfold [concatMap (replicate 3) [n,n+1..]|n-[1..]] [1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7] when it should'a been [1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,4,4] Cheers, :) Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] V.I.P.s and the associativity of merge'
Will Ness will_n48 at yahoo.com writes: ... they *both* turn out to be *completely* and utterly *wrong* :) in a general case (although probably for different reasons). Sorry, my bad. Thought in terms of merge, but the definiton used in VIP code was really an union. When definition was changed to a real merge, non-removing of duplicates, everything was as expected in that case, for both versions. *Main take 20 $ hfold [concatMap (replicate 3) [n,n+1..]|n-[1..]] *Main take 20 $ tfold [concatMap (replicate 3) [n,n+1..]|n-[1..]] [1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,4,4] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] V.I.P.s and the associativity of merge'
Will Ness wrote: Heinrich Apfelmus writes: Here an example where the VIP merge would give a different result bad = tfold $ (1:10:undefined) : (2:3:5:undefined) : (4:undefined) : error bad We have ghci bad [1,2*** Exception: bad but the VIP version would give at least ghci bad [1,2,3,4*** Exception: bad / Prelude: undefined The reason to *not* have the lazy patterns in foldTree for primes, as Daniel Fischer discovered back then, is that they give it a space leak. Case in point - http://ideone.com/DLHp2 : [...] tfold ((x:xs):t) = x : xs `merge` tfold (pairs t) where pairs ((x:xs):ys:t) = (x : merge xs ys) : pairs t hfold xs = serve . foldTree mergeP . map vip $ xs hfold' xs = serve . foldTree' mergeP . map vip $ xs foldTree f ~(x:xs) = x `f` foldTree f (pairs xs) where pairs ~(x: ~(y:ys)) = f x y : pairs ys foldTree' f (x:xs) = x `f` foldTree' f (pairs xs) where pairs (x: (y:ys)) = f x y : pairs ys [...] so hfold' too appears to be over-eager to-the-right, although still more productive than tfold. Leon Smith wrote: At a glance, it appears to have to do with the irrefutable patterns that treefold uses. At the moment, it's not obvious to me how your example could be made to work without either giving up the ability to work correctly on finite cases, or giving up the tree and going back to foldr merge'. ;-) [...] Also, is there other examples that could distinguish a VIP implementation that works on finite cases, and a no-VIP implementation? Is it possible to have a VIP example that works on finite cases and is maximally lazy in this example? Ah, the lazy patterns in foldTree are a different issue, sorry for my bad choice of example. While I still don't understand the trade-off that turns the lazy patterns into a space leak, there is no harm in allowing foldTree to see the complete spine of the list. What we do want to forbid is looking at the elements of that list too early. In other words, the example should read bad = tfold $ (1:10:undefined) : (2:3:5:undefined) : (4:undefined) : repeat (error bad : undefined) i.e. the previously unknown tail is replaced with an infinite list of undefined elements. This example can properly distinguish between the not-so-correct tfold and proper VIP implementations (or other implementations that don't do unnecessary comparisons). Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] V.I.P.s and the associativity of merge'
Will Ness wrote: Heinrich Apfelmus writes: Here an example where the VIP merge would give a different result bad = tfold $ (1:10:undefined) : (2:3:5:undefined) : (4:undefined) : error bad We have ghci bad [1,2*** Exception: bad but the VIP version would give at least ghci bad [1,2,3,4*** Exception: bad / Prelude: undefined The reason to *not* have the lazy patterns in foldTree for primes, as Daniel Fischer discovered back then, is that they give it a space leak. Case in point - http://ideone.com/DLHp2 : [...] tfold ((x:xs):t) = x : xs `merge` tfold (pairs t) where pairs ((x:xs):ys:t) = (x : merge xs ys) : pairs t hfold xs = serve . foldTree mergeP . map vip $ xs hfold' xs = serve . foldTree' mergeP . map vip $ xs foldTree f ~(x:xs) = x `f` foldTree f (pairs xs) where pairs ~(x: ~(y:ys)) = f x y : pairs ys foldTree' f (x:xs) = x `f` foldTree' f (pairs xs) where pairs (x: (y:ys)) = f x y : pairs ys [...] so hfold' too appears to be over-eager to-the-right, although still more productive than tfold. Leon Smith wrote: At a glance, it appears to have to do with the irrefutable patterns that treefold uses. At the moment, it's not obvious to me how your example could be made to work without either giving up the ability to work correctly on finite cases, or giving up the tree and going back to foldr merge'. ;-) [...] Also, is there other examples that could distinguish a VIP implementation that works on finite cases, and a no-VIP implementation? Is it possible to have a VIP example that works on finite cases and is maximally lazy in this example? Ah, the lazy patterns in foldTree are a different issue, sorry for my bad choice of example. While I still don't understand the trade-off that turns the lazy patterns into a space leak, there is no harm in allowing foldTree to see the complete spine of the list. What we do want to forbid is looking at the elements of that list too early. In other words, the example should read bad = tfold $ (1:10:undefined) : (2:3:5:undefined) : (4:undefined) : repeat (error bad : undefined) i.e. the previously unknown tail is replaced with an infinite list of undefined elements. This example can properly distinguish between the not-so-correct tfold and proper VIP implementations (or other implementations that don't do unnecessary comparisons). Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] V.I.P.s and the associativity of merge'
Leon Smith wrote: Ok, after mulling over the issues that Will Ness has brought up in the last few days [1], I think I have a partial explanation for the apparent tension between Will's observations and Heinrich Apfelmus's Implicit Heaps article [2], which both concern the implementation of mergeAll [3]. [1] http://permalink.gmane.org/gmane.comp.lang.haskell.cafe/84666 [2] http://apfelmus.nfshost.com/articles/implicit-heaps.html [3] http://hackage.haskell.org/packages/archive/data-ordlist/0.4.4/doc/html/Data-List-Ordered.html#v:mergeAll [...] This raises the question, is there some combination of the shape of the merge' tree and some input for which using VIPs dramatically changes the efficiency of a mergeAll operation? I suspect the answer is yes, though I don't know for sure at this point in time. However, I do tacitly believe that the current tree that mergeAll uses doesn't exhibit this property for any input, and so I have simplified the implementations of mergeAll and unionAll in the latest version of data-ordlist-0.4.4 by avoiding the use of VIPs. This has the nice side benefit of modestly improving performance when the elements of the result are highly biased towards the first list. Will Ness For those who remember the discussion about this about a year ago, it turns out there was a simpler version after all lurking somewhere in there (or is it _out_?). primes = 2: primes' where primes' = 3: 5: [7,9..] `minus` tfold [ [p*p,p*p+2*p..] | p - primes' ] tfold ((x:xs):t)= x : xs `union` tfold (pairs t) pairs ((x:xs):ys:t) = (x: union xs ys) : pairs t Unfortunately, it turns out that this program is clear, shorter ... and subtly wrong. :) Here an example where the VIP merge would give a different result bad = tfold $ (1:10:undefined) : (2:3:5:undefined) : (4:undefined) : error bad We have ghci bad [1,2*** Exception: bad but the VIP version would give at least ghci bad [1,2,3,4*** Exception: bad / Prelude: undefined In other words, this new program already tries to compare the number 3 to the fourth list when it is still clear that only the first three lists are relevant. Note that this doesn't necessarily mean that the program does not work for prime numbers, but *proving* correctness is now considerably more difficult because you need estimates about the growth and distribution of prime numbers. The VIP version always works as long as there are infinitely many primes. Also, since unnecessary comparisons are performed, it is no longer clear whether the time and space complexity stays the same. (Which is not as bad as it sounds, since we didn't know them well in the first place anyway). More worryingly, changing the tree shape now affects correctness. Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] V.I.P.s and the associativity of merge'
Heinrich Apfelmus apfelmus at quantentunnel.de writes: Leon Smith wrote: [1] http://permalink.gmane.org/gmane.comp.lang.haskell.cafe/84666 [2] http://apfelmus.nfshost.com/articles/implicit-heaps.html [3] http://hackage.haskell.org/packages/archive/data-ordlist/0.4.4/doc/html/Data- List-Ordered.html#v:mergeAll Will Ness primes = 2: primes' where primes' = 3: 5: [7,9..] `minus` tfold [ [p*p,p*p+2*p..] | p - primes' ] tfold ((x:xs):t)= x : xs `union` tfold (pairs t) pairs ((x:xs):ys:t) = (x: union xs ys) : pairs t Unfortunately, it turns out that this program is clear, shorter ... and subtly wrong. :) Here an example where the VIP merge would give a different result bad = tfold $ (1:10:undefined) : (2:3:5:undefined) : (4:undefined) : error bad We have ghci bad [1,2*** Exception: bad but the VIP version would give at least ghci bad [1,2,3,4*** Exception: bad / Prelude: undefined The reason to *not* have the lazy patterns in foldTree for primes, as Daniel Fischer discovered back then, is that they give it a space leak. Case in point - http://ideone.com/DLHp2 : - 1M primes: 2M primes: --- 3M: --- ideone #: - no-VIPs: smart fold: 1.90s- 4.8MB 4.42s- 4.8MB 7.40s- 4.8MB r3bdL - VIPs: smart fold: 1.95s- 4.8MB 4.53s- 4.8MB 7.45s- 4.8MB 4ACpe simple fold: 2.04s- 4.8MB 4.76s- 4.8MB 7.86s- 4.8MB av9XR lazy pats: 2.44s-20.1MB 5.70s-21.1MB 9.85s-42.6MB DLHp2 Also, having tfold ((x:xs):t) = x : xs `merge` tfold (pairs t) where pairs ((x:xs):ys:t) = (x : merge xs ys) : pairs t hfold xs = serve . foldTree mergeP . map vip $ xs hfold' xs = serve . foldTree' mergeP . map vip $ xs foldTree f ~(x:xs) = x `f` foldTree f (pairs xs) where pairs ~(x: ~(y:ys)) = f x y : pairs ys foldTree' f (x:xs) = x `f` foldTree' f (pairs xs) where pairs (x: (y:ys)) = f x y : pairs ys and bad = (1:10:error 1) : (2:3:5:error 2) : (4:error 4) : error bad bad2 = (1:10:error 1) : (2:3:5:error 2) : (4:error 4) : (5:error 5) : (6:error 6) : (7:error 7) : error bad2 we get *Main hfold bad [1,2,3,4*** Exception: bad *Main hfold' bad [1,2,3,4*** Exception: bad *Main tfold bad [1,2*** Exception: bad *Main hfold bad2 [1,2,3,4*** Exception: 4 *Main hfold' bad2 [1,2,3,4*** Exception: bad2 *Main tfold bad2 [1,2*** Exception: bad2 so hfold' too appears to be over-eager to-the-right, although still more productive than tfold. Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] V.I.P.s and the associativity of merge'
Ok, after mulling over the issues that Will Ness has brought up in the last few days [1], I think I have a partial explanation for the apparent tension between Will's observations and Heinrich Apfelmus's Implicit Heaps article [2], which both concern the implementation of mergeAll [3]. The merge' function takes two ordered lists, with the head of the first list less than the head of the second, and merges their contents: merge' [] ys = ys merge' (x:xs) ys = x : merge xs ys The nice thing about this function is we can merge an infinite number of lists by folding right, if assume that the heads of those lists are appropriately ordered. This appears in Richard Bird's code at the end of Melissa O'Neill's Genuine Sieve of Eratosthenes, though undoubtedly this observation has been independently made by many people. Now, in an ordinary sense, merge' *is* an associative operator, in that given three fully defined ordered lists with ordered heads, merge' xs (merge' ys zs) == merge' (merge' xs ys) zs. This allows us to merge an infinite number of lists using an arbitrary tree of merge' operations, without ever worrying that we will return an incorrect result. (However, we might get stuck in a non-productive loop, for example if we fold left over an infinite list of ordered lists) However, Heinrich's article uses a stronger sense of associativity which includes laziness properties and thus captures some operational characteristics. In this sense, merge' *is not* associative.To remedy this, Heinrich uses VIPs to strengthen the associativity property of merge'. This raises the question, is there some combination of the shape of the merge' tree and some input for which using VIPs dramatically changes the efficiency of a mergeAll operation? I suspect the answer is yes, though I don't know for sure at this point in time. However, I do tacitly believe that the current tree that mergeAll uses doesn't exhibit this property for any input, and so I have simplified the implementations of mergeAll and unionAll in the latest version of data-ordlist-0.4.4 by avoiding the use of VIPs. This has the nice side benefit of modestly improving performance when the elements of the result are highly biased towards the first list. Best, Leon [1] http://permalink.gmane.org/gmane.comp.lang.haskell.cafe/84666 [2] http://apfelmus.nfshost.com/articles/implicit-heaps.html [3] http://hackage.haskell.org/packages/archive/data-ordlist/0.4.4/doc/html/Data-List-Ordered.html#v:mergeAll ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe