Re: [Haskell-cafe] Is fusion overrated?
> which is capable of producing elements one-by-one. So the whole thing
> probably should run in constant space as well.

Besides reducing the number of GC calls, performance would also improve because the GC calls that remain are cheaper. The original program may run in constant space, but the fused program may use an even smaller constant space, which in turn means that whenever the GC needs to make a pass, it is faster.

Facundo

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Is fusion overrated?
On Wed, May 18, 2011 at 2:04 AM, Ryan Ingram wrote:
> Yes, the goal isn't so much to improve complexity (both are O(1)) but to
> reduce the constant factor on that O(1).
>
> In an inner loop like that, allocator/gc calls by far dominate the cost of
> the program. If you can remove them, you've improved the performance of the
> program by 10-100x.

Yes, this is important. Fusion is an obvious win on strict structures, because it can make the space usage asymptotically better. However, even if this doesn't happen for lazy structures, fusion can save a lot of _time_. It won't make the time complexity asymptotically better, but it can save an amount of work proportional to the complexity of the algorithm.

For instance, I was playing around with uvector a while back, and needed foldr or something similar that wasn't available. So I wrote something like:

    foldr f z s = if null s then z else f (head s) (foldr f z (tail s))

This all ran in constant space, but tail re-unfolds the entire stream every time, so this function has time complexity O(n^2): the nth element chugs through n allocations-followed-by-deallocations before it becomes usable, which can all be done in constant space, but takes linear time. Fusion won't save you in this example (to my knowledge).

But if you compose k functions from lists to lists together, you'll have k allocations-followed-by-deallocations on every element that makes it all the way through the pipeline. You'll see O(1) space usage, but your time usage has a c*k*n term simply from being expressed as a composition pipeline, where c is the cost of the unnecessary boxing. Fusion eliminates this term.

-- Dan
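The c*k*n point above can be sketched concretely. The names below are illustrative, not from the thread: a 3-stage list pipeline allocates and immediately discards a cons cell per stage per element, while a hand-fused single pass does the same arithmetic in one traversal with no intermediate lists.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

-- Three list-to-list stages composed: each stage boxes and unboxes
-- every element that flows through it (the c*k*n term).
pipeline :: [Int] -> Int
pipeline = sum . map (* 2) . map (+ 1) . filter even

-- What fusion aims to produce: a single strict loop, no cons cells.
fused :: [Int] -> Int
fused = go 0
  where
    go !acc []       = acc
    go !acc (x : xs)
      | even x       = go (acc + (x + 1) * 2) xs
      | otherwise    = go acc xs

main :: IO ()
main = print (pipeline [1 .. 10], fused [1 .. 10])  -- both 70
```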
Re: [Haskell-cafe] Is fusion overrated?
Roman Cheplyaka wrote:
> Of course I don't claim that fusion is useless -- just trying to
> understand the problem it solves. Are we saving a few closures and cons
> cells here?

In addition to what everyone else said, fusion can be a big win when it allows further optimisations. For instance, fusing map (+1) . map (+2) can eliminate one addition per iteration. Even without taking allocation into account, most of the reasons why loop fusion is a worthwhile optimisation in C apply to Haskell, too!

Roman
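A minimal sketch of the kind of rewrite rule behind this (not from the thread; GHC's real list fusion in base is phrased via build/foldr, but the classic map/map rule shows the idea): once the two traversals collapse into one, the simplifier is free to merge the two additions into a single (+3).

```haskell
module Main where

-- Classic map/map short-cut rule: collapse two list traversals into
-- one. After rewriting, map (+1) (map (+2) xs) becomes
-- map ((+1) . (+2)) xs, whose body GHC can simplify to a single add.
{-# RULES
"map/map"  forall f g xs.  map f (map g xs) = map (f . g) xs
  #-}

addBoth :: [Int] -> [Int]
addBoth = map (+ 1) . map (+ 2)

main :: IO ()
main = print (addBoth [1, 2, 3])  -- [4,5,6], whether or not the rule fires
```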
Re: [Haskell-cafe] Is fusion overrated?
Also, we do fusion on strict structures (e.g. vectors), where you get back O(n) space on each fused point. Obviously, it is less of a win on lazy structures than the (pathological) case of strict data, but it is still a win.

-- Don

On Tue, May 17, 2011 at 11:07 PM, Ben Lippmeier wrote:
>
> On 18/05/2011, at 15:55 , Roman Cheplyaka wrote:
>> Of course I don't claim that fusion is useless -- just trying to
>> understand the problem it solves. Are we saving a few closures and cons
>> cells here?
>
> And thunk allocations, and thunk entries. Entering a thunk costs upwards of
> 20 cycles, while performing a single addition should only cost one. Imagine
> every thunk entry is a function call. You don't want to call a whole function
> just to add two numbers together.
>
> Those "few closures and cons cells" can be surprisingly expensive when
> compared to native ALU instructions on a modern machine.
>
> Ben.
Re: [Haskell-cafe] Is fusion overrated?
On 18/05/2011, at 15:55 , Roman Cheplyaka wrote:
> Of course I don't claim that fusion is useless -- just trying to
> understand the problem it solves. Are we saving a few closures and cons
> cells here?

And thunk allocations, and thunk entries. Entering a thunk costs upwards of 20 cycles, while performing a single addition should only cost one. Imagine every thunk entry is a function call: you don't want to call a whole function just to add two numbers together.

Those "few closures and cons cells" can be surprisingly expensive when compared to native ALU instructions on a modern machine.

Ben.
Re: [Haskell-cafe] Is fusion overrated?
Yes, the goal isn't so much to improve complexity (both are O(1)) but to reduce the constant factor on that O(1).

In an inner loop like that, allocator/GC calls by far dominate the cost of the program. If you can remove them, you've improved the performance of the program by 10-100x.

In the case where everything is Int, you can even unbox and stay entirely in registers, which gives you performance comparable to a hand-tuned C or assembly-language loop.

  -- ryan

On Tue, May 17, 2011 at 10:55 PM, Roman Cheplyaka wrote:
> If one thinks of Haskell data structures as ordinary data
> structures, fusion seems a great deal -- instead of producing
> intermediate lists and possibly running out of memory, we just run a
> loop and use a constant amount of space.
>
> But Haskell data structures are quite different -- they are produced as
> demanded. Consider the example from the Stream Fusion paper[1]:
>
>    f :: Int → Int
>    f n = sum [ k ∗ m | k ← [1..n], m ← [1..k] ]
>
> Assuming the sum is a strict left fold, it consumes elements of the lists
> one-by-one and runs in constant space.
>
> The list part can be transformed to
>
>    foldr (++) [] $ map (\k -> map (\m -> k*m) [1..k]) [1..n]
>
> which is capable of producing elements one-by-one. So the whole thing
> probably should run in constant space as well.
>
> Of course I don't claim that fusion is useless -- just trying to
> understand the problem it solves. Are we saving a few closures and cons
> cells here?
>
> [1] Stream Fusion. From Lists to Streams to Nothing at All.
>     Duncan Coutts, Roman Leshchinskiy, Don Stewart.
>
> --
> Roman I. Cheplyaka :: http://ro-che.info/
> Don't worry what people think, they don't do it very often.
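For the quoted example, the "unbox and stay in registers" outcome looks roughly like the following. This is a hedged sketch of what one hopes the compiler produces after fusion and worker/wrapper unboxing, not compiler output; fSum is an illustrative name.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

-- Hand-written version of sum [ k*m | k <- [1..n], m <- [1..k] ]
-- as a pair of nested counting loops with strict accumulators:
-- no lists, no thunks, no allocation in the inner loop.
fSum :: Int -> Int
fSum n = outer 1 0
  where
    outer !k !acc
      | k > n     = acc
      | otherwise = outer (k + 1) (inner 1 acc)
      where
        inner !m !acc'
          | m > k     = acc'
          | otherwise = inner (m + 1) (acc' + k * m)

main :: IO ()
main = print (fSum 3)  -- 1 + (2+4) + (3+6+9) = 25
```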
[Haskell-cafe] Is fusion overrated?
If one thinks of Haskell data structures as ordinary data structures, fusion seems a great deal -- instead of producing intermediate lists and possibly running out of memory, we just run a loop and use a constant amount of space.

But Haskell data structures are quite different -- they are produced as demanded. Consider the example from the Stream Fusion paper[1]:

    f :: Int → Int
    f n = sum [ k ∗ m | k ← [1..n], m ← [1..k] ]

Assuming the sum is a strict left fold, it consumes elements of the lists one-by-one and runs in constant space.

The list part can be transformed to

    foldr (++) [] $ map (\k -> map (\m -> k*m) [1..k]) [1..n]

which is capable of producing elements one-by-one. So the whole thing probably should run in constant space as well.

Of course I don't claim that fusion is useless -- just trying to understand the problem it solves. Are we saving a few closures and cons cells here?

[1] Stream Fusion. From Lists to Streams to Nothing at All.
    Duncan Coutts, Roman Leshchinskiy, Don Stewart.

--
Roman I. Cheplyaka :: http://ro-che.info/
Don't worry what people think, they don't do it very often.
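The rewriting above can be checked directly. Both definitions below are taken from the message; only the names f and f' are mine.

```haskell
module Main where

-- f: the original comprehension from the Stream Fusion paper example.
-- f': the same computation with the list part desugared by hand into
-- the foldr/map form, which still yields elements one at a time.
f, f' :: Int -> Int
f  n = sum [ k * m | k <- [1 .. n], m <- [1 .. k] ]
f' n = sum (foldr (++) [] (map (\k -> map (\m -> k * m) [1 .. k]) [1 .. n]))

main :: IO ()
main = print (f 100 == f' 100)  -- True
```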