Re: [Haskell-cafe] appending an element to a list
Lets look at the actual reductions going on. To make the example easier, I would like to use last instead of your complicated until. It shouldn't make a difference. [ winds up doing twice as much work ] This was also my intuition. I had a function that built up a large output list by generating chunks and ++ them onto the output (e.g. basically concatMap). My intuition was that this would get gradually slower because each (++) implied a copy of the previous part of the list. So if I have another function consuming the eventual output it will get slower and slower because demanding an element will reduce the (++)s for each chunk, and copy the element 1000 times if I have appended 1000 chunks by now. So I switched to using DList, which has O(1) append. Then I ran timing on a list that wound up adding up a million or so chunks... and both versions were exactly the same speed (and -O2 gave both an equally big speed boost). In fact, when I try with a toy example, the DList using one is much slower than the concatMap using one, which I don't quite understand. Shouldn't concatMap be quite inefficient? The program below gives me this output: 1500 45.7416 1500 110.2112 -O2 brings both implementations to 45. Interestingly, if I call 't1' from ghci, it sucks up 1gb of memory and slows the system to a crawl until I kill it.. this is *after* it's computed a result and given me my prompt back... even if its somehow keeping the whole list around in some ghci variable (where?), isn't 1gb+ a lot even for 15m boxed Integers? And why does it continue to grow after the function has completed? The compiled version doesn't have this problem. import System.CPUTime import qualified Data.DList as DList dconcat_map f xs = DList.toList (DList.concat (map (DList.fromList . f) xs)) mkchunk n = [n, n*2, n*3] main = do t1 t2 t1 = do let a = concatMap mkchunk [0..500] t - getCPUTime print (last a) t2 - getCPUTime print (fromIntegral (t2 - t) / fromIntegral cpuTimePrecision) t2 = do let a = dconcat_map mkchunk [0..500] t - getCPUTime print (last a) t2 - getCPUTime print (fromIntegral (t2 - t) / fromIntegral cpuTimePrecision) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] appending an element to a list
On Wed, 2008-06-11 at 17:18 -0700, Evan Laforge wrote: Lets look at the actual reductions going on. To make the example easier, I would like to use last instead of your complicated until. It shouldn't make a difference. [ winds up doing twice as much work ] This was also my intuition. I had a function that built up a large output list by generating chunks and ++ them onto the output (e.g. basically concatMap). My intuition was that this would get gradually slower because each (++) implied a copy of the previous part of the list. So if I have another function consuming the eventual output it will get slower and slower because demanding an element will reduce the (++)s for each chunk, and copy the element 1000 times if I have appended 1000 chunks by now. So I switched to using DList, which has O(1) append. Then I ran timing on a list that wound up adding up a million or so chunks... and both versions were exactly the same speed (and -O2 gave both an equally big speed boost). In fact, when I try with a toy example, the DList using one is much slower than the concatMap using one, which I don't quite understand. Shouldn't concatMap be quite inefficient? concatMap is very efficient. (++) isn't slow. Left associative uses of (++) are slow. concatMap = foldr ((++) . f) [] The program below gives me this output: 1500 45.7416 1500 110.2112 -O2 brings both implementations to 45. Interestingly, if I call 't1' from ghci, it sucks up 1gb of memory and slows the system to a crawl until I kill it.. this is *after* it's computed a result and given me my prompt back... even if its somehow keeping the whole list around in some ghci variable (where?), isn't 1gb+ a lot even for 15m boxed Integers? And why does it continue to grow after the function has completed? The compiled version doesn't have this problem. import System.CPUTime import qualified Data.DList as DList dconcat_map f xs = DList.toList (DList.concat (map (DList.fromList . f) xs)) mkchunk n = [n, n*2, n*3] main = do t1 t2 t1 = do let a = concatMap mkchunk [0..500] t - getCPUTime print (last a) t2 - getCPUTime print (fromIntegral (t2 - t) / fromIntegral cpuTimePrecision) t2 = do let a = dconcat_map mkchunk [0..500] t - getCPUTime print (last a) t2 - getCPUTime print (fromIntegral (t2 - t) / fromIntegral cpuTimePrecision) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] appending an element to a list
My $0.02 is to say -- O(1) longList ++ [5] Yay. I've got a thunk. Oh wait, I need to access the '5'? No different than doing so for -- O(n) until ((==5) . head) [l,o,n,g,L,i,s,t,5] It's not the (++) that's O(n). It's the list traversal. I can further beat this pedantic point to death by pointing out there is no difference between longList ++ [5] and longList ++ (repeat 5) Getting to the first 5 is still O(n). Cheers, -ljr Tillmann Rendel wrote: Adrian Neumann wrote: Hello, I was wondering how expensive appending something to a list really is. Say I write I'd say longList ++ [5] stays unevaluated until I consumed the whole list and then appending should go in O(1). Similarly when concatenating two lists. Is that true, or am I missing something? I think that is true and you are missing something: You have to push the call to ++ through the whole longList while consuming it wholy one element at a time! So when longList has n elements, you have (n+1) calls of ++, each returning after O(1) steps. The first n calls return a list with the ++ pushed down, and the last returns [5]. Summed together, that makes O(n) actual calls of ++ for one written by the programmer. Tillmann ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe -- Lanny Ripple [EMAIL PROTECTED] ScmDB / Cisco Systems, Inc. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] appending an element to a list
Lanny Ripple wrote: My $0.02 is to say -- O(1) longList ++ [5] Yay. I've got a thunk. Oh wait, I need to access the '5'? No different than doing so for -- O(n) until ((==5) . head) [l,o,n,g,L,i,s,t,5] It's not the (++) that's O(n). It's the list traversal. Lets look at the actual reductions going on. To make the example easier, I would like to use last instead of your complicated until. It shouldn't make a difference. Lets look at the reduction of (last longList) to whnf first: L) last longList L) ~ last ongList L) ~ last ngList L) ~ last gList L) ~ last List L) ~ last ist L) ~ last st L) ~ last t ~ 't' The L prefixed marks all lines which are reduced by calls to last. Clearly, we need n reduction steps here. Now, what about last (longList ++ !)? A) last (longList ++ !) L) ~ last ('l' : (ongList ++ !)) A) ~ last (ongList ++ !) L) ~ last ('o' : (ngList ++ !)) A) ~ last (ngList ++ !) L) ~ last ('n' : (gList ++ !)) A) ~ last (gList ++ !) L) ~ last ('g' : (List ++ !)) A) ~ last (List ++ !) L) ~ last ('L' : (ist ++ !)) A) ~ last (ist ++ !) L) ~ last ('i' : (st ++ !)) A) ~ last (st ++ !) L) ~ last ('s' : (t ++ !)) A) ~ last (t ++ !) L) ~ last ('t' : ( ++ !)) A) ~ last ( ++ !) L) ~ last ! ~ '!' Calls to ++ are marked with A (for append). Now, we have to reduce a call to ++ everytime before we can reduce a call to last, so we have n steps for calls of last as before + n steps for interleaved calls of ++ + 1 step for the final call of ++ + 1 step for the final call of last = 2n + 2 steps in total The difference between (2n + 2) and (n) is (n + 2) and lies clearly in O(n). So, by the addition of ++ ! to our program, we had to do O(n) reduction steps more. Since we had to do O(n) reductions steps anyway, this didn't show up in the overall complexity, but our program is only half as fast, instead of constant amount slower, which seems to make a difference to me. And other programs could suffer even more badly, since their complexity could go up, e.g., from O(n) to O(n^2). A simple example is this naive reverse function: reverse [] = [] reverse (x:xs) = reverse xs ++ [x] let's see how (last (reverse long)) is reduced to whnf. I will not even attempt longList ... R) last (reverse long) R) ~ last (reverse ong ++ l) R) ~ last ((reverse ng ++ o) ++ l) R) ~ last (((reverse g ++ n) ++ o) ++ l) R) ~ last reverse ++ g) ++ n) ++ o) ++ l) R) ~ last ++ g) ++ n) ++ o) ++ l) At this point, we have reduced reverse in n steps to an expression containing n calls to ++. If ++ were O(1), we would need only O(n) additional steps to finish the job. But see what happens: last ++ g) ++ n) ++ o) ++ l) A) ~ last (((g ++ n) ++ o) ++ l) The first ++ was easy, only 1 reduction step. last (((g ++ n) ++ o) ++ l) A) ~ last ((('g' : ( ++ n)) ++ o) ++ l) A) ~ last (('g' : (( ++ n) ++ o)) ++ l) A) ~ last ('g' : ((( ++ n) ++ o) ++ l)) L) ~ last ((( ++ n) ++ o) ++ l) A) ~ last ((n ++ o) ++ l) Oups, for the second ++, we needed n reduction steps to move the first char out of all these nested ++'s. last ((n ++ o) ++ l) A) ~ last (('n' : ( ++ o)) ++ l) A) ~ last ('n' : (( ++ o) ++ l)) L) ~ last (( ++ o) ++ l) A) ~ last (o ++ l) Another (n - 1) reduction steps for the second ++ to go away. last (o ++ l) A) ~ last ('o' : ++ l)) L) ~ last ( ++ l) A) ~ last (l) L) ~ 'l' And the third and fourth ++ go away with (n - 2) and (n - 3) reduction steps. Counting together, we had to use n + (n - 1) + (n - 2) + ... = n! reduction steps to get rid of the n calls to ++, which lies in O(n^2). Thats what we expected of course, since we know that each of the ++ would need O(n) steps. I can further beat this pedantic point to death by pointing out there is no difference between longList ++ [5] and longList ++ (repeat 5) Getting to the first 5 is still O(n). That's a different question. For the complexity of ++, the right-hand side operand is irrelevant. The n means the length of the left-hand side operand here. Tillmann ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] appending an element to a list
Hello, I was wondering how expensive appending something to a list really is. Say I write I'd say longList ++ [5] stays unevaluated until I consumed the whole list and then appending should go in O(1). Similarly when concatenating two lists. Is that true, or am I missing something? Adrian signature.asc Description: OpenPGP digital signature ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] appending an element to a list
Hi, Am Donnerstag, den 29.05.2008, 19:04 +0200 schrieb Adrian Neumann: I was wondering how expensive appending something to a list really is. Say I write I'd say longList ++ [5] stays unevaluated until I consumed the whole list and then appending should go in O(1). Similarly when concatenating two lists. Is that true, or am I missing something? I’m no expert, but I give it shot: The problem is that longList might be referenced somewhere else as well, so it has to be kept around, ending in [], not in [5]. But (longList ++ [5]) also might be referenced somewhere, so you also need to keep that list. Thus you have to copy the whole list structure for the appending (not the values, though). For comparision, with (5:longList), the new list can use the old list unmodified, so nothing has to be copied. You can also observe this in the code for (++): (++) :: [a] - [a] - [a] (++) [] ys = ys (++) (x:xs) ys = x : xs ++ ys where you can see that on the right hand side, a totally new list is constructed. In some cases, e.g. when the longList is only referenced there and nowhere else, one might hope that the compiler can optimize this problem away. There is some hope, as I see this in the code: {-# RULES ++[~1] forall xs ys. xs ++ ys = augment (\c n - foldr c n xs) ys #-} Maybe some core-literate people can give more information on this? Greetings, Joachim -- Joachim nomeata Breitner mail: [EMAIL PROTECTED] | ICQ# 74513189 | GPG-Key: 4743206C JID: [EMAIL PROTECTED] | http://www.joachim-breitner.de/ Debian Developer: [EMAIL PROTECTED] signature.asc Description: Dies ist ein digital signierter Nachrichtenteil ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] appending an element to a list
Adrian Neumann wrote: Hello, I was wondering how expensive appending something to a list really is. Say I write I'd say longList ++ [5] stays unevaluated until I consumed the whole list and then appending should go in O(1). Similarly when concatenating two lists. Is that true, or am I missing something? I think that is true and you are missing something: You have to push the call to ++ through the whole longList while consuming it wholy one element at a time! So when longList has n elements, you have (n+1) calls of ++, each returning after O(1) steps. The first n calls return a list with the ++ pushed down, and the last returns [5]. Summed together, that makes O(n) actual calls of ++ for one written by the programmer. Tillmann ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] appending an element to a list
On Thu, May 29, 2008 at 11:48 PM, Tillmann Rendel [EMAIL PROTECTED] wrote: Adrian Neumann wrote: Hello, I was wondering how expensive appending something to a list really is. Say I write I'd say longList ++ [5] stays unevaluated until I consumed the whole list and then appending should go in O(1). Similarly when concatenating two lists. Is that true, or am I missing something? I think that is true and you are missing something: You have to push the call to ++ through the whole longList while consuming it wholy one element at a time! So when longList has n elements, you have (n+1) calls of ++, each returning after O(1) steps. The first n calls return a list with the ++ pushed down, and the last returns [5]. Summed together, that makes O(n) actual calls of ++ for one written by the programmer. Tillmann In other words, if you look at the prototype of ++ given in the prelude, it generates a new list with first (length longList) elements same as those of longList, followed by the second list. So when you are accessing elements of (longList ++ s), you are actually accessing the elements of this newly generated list, which are generated as and when you access them, so that by the time you reach the first element of s, you have generated (length longList) elements of the result of ++. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe