Re: [Haskell-cafe] appending an element to a list

2008-06-11 Thread Evan Laforge
 Lets look at the actual reductions going on. To make the example easier,
 I would like to use last instead of your complicated until. It shouldn't
 make a difference.

[ winds up doing twice as much work ]

This was also my intuition.  I had a function that built up a large
output list by generating chunks and ++ them onto the output (e.g.
basically concatMap).  My intuition was that this would get gradually
slower because each (++) implied a copy of the previous part of the
list.  So if I have another function consuming the eventual output it
will get slower and slower because demanding an element will reduce
the (++)s for each chunk, and copy the element 1000 times if I have
appended 1000 chunks by now.

So I switched to using DList, which has O(1) append.  Then I ran
timing on a list that wound up adding up a million or so chunks... and
both versions were exactly the same speed (and -O2 gave both an
equally big speed boost).

In fact, when I try with a toy example, the DList using one is much
slower than the concatMap using one, which I don't quite understand.
Shouldn't concatMap be quite inefficient?

The program below gives me this output:

1500
45.7416
1500
110.2112

-O2 brings both implementations to 45.

Interestingly, if I call 't1' from ghci, it sucks up 1gb of memory and
slows the system to a crawl until I kill it.. this is *after* it's
computed a result and given me my prompt back... even if its somehow
keeping the whole list around in some ghci variable (where?), isn't
1gb+ a lot even for 15m boxed Integers?  And why does it continue to
grow after the function has completed?  The compiled version doesn't
have this problem.


import System.CPUTime
import qualified Data.DList as DList

dconcat_map f xs = DList.toList (DList.concat (map (DList.fromList . f) xs))

mkchunk n = [n, n*2, n*3]

main = do
t1
t2

t1 = do
let a = concatMap mkchunk [0..500]
t - getCPUTime
print (last a)
t2 - getCPUTime
print (fromIntegral (t2 - t) / fromIntegral cpuTimePrecision)

t2 = do
let a = dconcat_map mkchunk [0..500]
t - getCPUTime
print (last a)
t2 - getCPUTime
print (fromIntegral (t2 - t) / fromIntegral cpuTimePrecision)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] appending an element to a list

2008-06-11 Thread Derek Elkins
On Wed, 2008-06-11 at 17:18 -0700, Evan Laforge wrote:
  Lets look at the actual reductions going on. To make the example easier,
  I would like to use last instead of your complicated until. It shouldn't
  make a difference.
 
 [ winds up doing twice as much work ]
 
 This was also my intuition.  I had a function that built up a large
 output list by generating chunks and ++ them onto the output (e.g.
 basically concatMap).  My intuition was that this would get gradually
 slower because each (++) implied a copy of the previous part of the
 list.  So if I have another function consuming the eventual output it
 will get slower and slower because demanding an element will reduce
 the (++)s for each chunk, and copy the element 1000 times if I have
 appended 1000 chunks by now.
 
 So I switched to using DList, which has O(1) append.  Then I ran
 timing on a list that wound up adding up a million or so chunks... and
 both versions were exactly the same speed (and -O2 gave both an
 equally big speed boost).
 
 In fact, when I try with a toy example, the DList using one is much
 slower than the concatMap using one, which I don't quite understand.
 Shouldn't concatMap be quite inefficient?

concatMap is very efficient.  (++) isn't slow.  Left associative uses of
(++) are slow.  concatMap = foldr ((++) . f) []

 
 The program below gives me this output:
 
 1500
 45.7416
 1500
 110.2112
 
 -O2 brings both implementations to 45.
 
 Interestingly, if I call 't1' from ghci, it sucks up 1gb of memory and
 slows the system to a crawl until I kill it.. this is *after* it's
 computed a result and given me my prompt back... even if its somehow
 keeping the whole list around in some ghci variable (where?), isn't
 1gb+ a lot even for 15m boxed Integers?  And why does it continue to
 grow after the function has completed?  The compiled version doesn't
 have this problem.
 
 
 import System.CPUTime
 import qualified Data.DList as DList
 
 dconcat_map f xs = DList.toList (DList.concat (map (DList.fromList . f) xs))
 
 mkchunk n = [n, n*2, n*3]
 
 main = do
 t1
 t2
 
 t1 = do
 let a = concatMap mkchunk [0..500]
 t - getCPUTime
 print (last a)
 t2 - getCPUTime
 print (fromIntegral (t2 - t) / fromIntegral cpuTimePrecision)
 
 t2 = do
 let a = dconcat_map mkchunk [0..500]
 t - getCPUTime
 print (last a)
 t2 - getCPUTime
 print (fromIntegral (t2 - t) / fromIntegral cpuTimePrecision)
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] appending an element to a list

2008-05-30 Thread Lanny Ripple
My $0.02 is to say

  -- O(1)
  longList ++ [5]

Yay.  I've got a thunk.  Oh wait, I need to access the '5'?  No
different than doing so for

  -- O(n)
  until ((==5) . head) [l,o,n,g,L,i,s,t,5]

It's not the (++) that's O(n).  It's the list traversal.  I can
further beat this pedantic point to death by pointing out there is
no difference between

  longList ++ [5]

and

  longList ++ (repeat 5)

Getting to the first 5 is still O(n).

  Cheers,
  -ljr


Tillmann Rendel wrote:
 
 
 Adrian Neumann wrote:
 Hello,

 I was wondering how expensive appending something to a list really is.
 Say I write

 I'd say longList ++ [5] stays unevaluated until I consumed the whole
 list and then appending should go in O(1). Similarly when
 concatenating two lists.

 Is that true, or am I missing something?
 
 I think that is true and you are missing something: You have to push the
 call to ++ through the whole longList while consuming it wholy one
 element at a time! So when longList has n elements, you have (n+1) calls
 of ++, each returning after O(1) steps. The first n calls return a list
 with the ++ pushed down, and the last returns [5]. Summed together, that
 makes O(n) actual calls of ++ for one written by the programmer.
 
   Tillmann
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

-- 
Lanny Ripple [EMAIL PROTECTED]
ScmDB / Cisco Systems, Inc.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] appending an element to a list

2008-05-30 Thread Tillmann Rendel

Lanny Ripple wrote:

My $0.02 is to say

-- O(1) longList ++ [5]

Yay.  I've got a thunk.  Oh wait, I need to access the '5'?  No 
different than doing so for


-- O(n) until ((==5) . head) [l,o,n,g,L,i,s,t,5]

It's not the (++) that's O(n).  It's the list traversal.


Lets look at the actual reductions going on. To make the example easier,
I would like to use last instead of your complicated until. It shouldn't
make a difference.

Lets look at the reduction of (last longList) to whnf first:

L)  last longList
L)  ~  last ongList
L)  ~  last ngList
L)  ~  last gList
L)  ~  last List
L)  ~  last ist
L)  ~  last st
L)  ~  last t
~  't'

The L prefixed marks all lines which are reduced by calls to last.
Clearly, we need n reduction steps here.

Now, what about last (longList ++ !)?

A)  last (longList ++ !)
L)  ~  last ('l' : (ongList ++ !))
A)  ~  last (ongList ++ !)
L)  ~  last ('o' : (ngList ++ !))
A)  ~  last (ngList ++ !)
L)  ~  last ('n' : (gList ++ !))
A)  ~  last (gList ++ !)
L)  ~  last ('g' : (List ++ !))
A)  ~  last (List ++ !)
L)  ~  last ('L' : (ist ++ !))
A)  ~  last (ist ++ !)
L)  ~  last ('i' : (st ++ !))
A)  ~  last (st ++ !)
L)  ~  last ('s' : (t ++ !))
A)  ~  last (t ++ !)
L)  ~  last ('t' : ( ++ !))
A)  ~  last ( ++ !)
L)  ~  last !
~  '!'

Calls to ++ are marked with A (for append). Now, we have to reduce a
call to ++ everytime before we can reduce a call to last, so we have

n steps for calls of last as before
  + n steps for interleaved calls of ++
  + 1 step for the final call of ++
  + 1 step for the final call of last
  = 2n + 2 steps in total

The difference between (2n + 2) and (n) is (n + 2) and lies clearly in
O(n). So, by the addition of ++ ! to our program, we had to do O(n)
reduction steps more.

Since we had to do O(n) reductions steps anyway, this didn't show up in
the overall complexity, but our program is only half as fast,
instead of constant amount slower, which seems to make a difference to me.

And other programs could suffer even more badly, since their complexity
could go up, e.g., from O(n) to O(n^2). A simple example is this naive
reverse function:

  reverse [] = []
  reverse (x:xs) = reverse xs ++ [x]

let's see how (last (reverse long)) is reduced to whnf. I will not
even attempt longList ...

R)  last (reverse long)
R)  ~  last (reverse ong ++ l)
R)  ~  last ((reverse ng ++ o) ++ l)
R)  ~  last (((reverse g ++ n) ++ o) ++ l)
R)  ~  last reverse  ++ g) ++ n) ++ o) ++ l)
R)  ~  last  ++ g) ++ n) ++ o) ++ l)

At this point, we have reduced reverse in n steps to an expression
containing n calls to ++. If ++ were O(1), we would need only O(n)
additional steps to finish the job. But see what happens:

last  ++ g) ++ n) ++ o) ++ l)
A)  ~  last (((g ++ n) ++ o) ++ l)

The first ++ was easy, only 1 reduction step.

last (((g ++ n) ++ o) ++ l)
A)  ~  last ((('g' : ( ++ n)) ++ o) ++ l)
A)  ~  last (('g' : (( ++ n) ++ o)) ++ l)
A)  ~  last ('g' : ((( ++ n) ++ o) ++ l))
L)  ~  last ((( ++ n) ++ o) ++ l)
A)  ~  last ((n ++ o) ++ l)

Oups, for the second ++, we needed n reduction steps to move the first
char out of all these nested ++'s.

last ((n ++ o) ++ l)
A)  ~  last (('n' : ( ++ o)) ++ l)
A)  ~  last ('n' : (( ++ o) ++ l))
L)  ~  last (( ++ o) ++ l)
A)  ~  last (o ++ l)

Another (n - 1) reduction steps for the second ++ to go away.

last (o ++ l)
A)  ~  last ('o' :  ++ l))
L)  ~  last ( ++ l)
A)  ~  last (l)
L)  ~  'l'

And the third and fourth ++ go away with (n - 2) and (n - 3) reduction
steps. Counting together, we had to use

  n + (n - 1) + (n - 2) + ... = n!

reduction steps to get rid of the n calls to ++, which lies in O(n^2).
Thats what we expected of course, since we know that each of the ++
would need O(n) steps.


I can further beat this pedantic point to death by pointing out there
is no difference between

longList ++ [5]

and

longList ++ (repeat 5)

Getting to the first 5 is still O(n).


That's a different question. For the complexity of ++, the right-hand
side operand is irrelevant. The n means the length of the left-hand side
operand here.

  Tillmann
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] appending an element to a list

2008-05-29 Thread Adrian Neumann

Hello,

I was wondering how expensive appending something to a list really is. 
Say I write


I'd say longList ++ [5] stays unevaluated until I consumed the whole 
list and then appending should go in O(1). Similarly when concatenating 
two lists.


Is that true, or am I missing something?

Adrian



signature.asc
Description: OpenPGP digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] appending an element to a list

2008-05-29 Thread Joachim Breitner
Hi,

Am Donnerstag, den 29.05.2008, 19:04 +0200 schrieb Adrian Neumann:
 I was wondering how expensive appending something to a list really is. 
 Say I write
 
 I'd say longList ++ [5] stays unevaluated until I consumed the whole 
 list and then appending should go in O(1). Similarly when concatenating 
 two lists.
 
 Is that true, or am I missing something?

I’m no expert, but I give it shot:

The problem is that longList might be referenced somewhere else as well,
so it has to be kept around, ending in [], not in [5]. But (longList ++
[5]) also might be referenced somewhere, so you also need to keep that
list. Thus you have to copy the whole list structure for the appending
(not the values, though).

For comparision, with (5:longList), the new list can use the old list
unmodified, so nothing has to be copied.


You can also observe this in the code for (++):

(++) :: [a] - [a] - [a]
(++) [] ys = ys
(++) (x:xs) ys = x : xs ++ ys

where you can see that on the right hand side, a totally new list is
constructed.


In some cases, e.g. when the longList is only referenced there and
nowhere else, one might hope that the compiler can optimize this problem
away. There is some hope, as I see this in the code:

{-# RULES
++[~1] forall xs ys. xs ++ ys = augment (\c n - foldr c n xs) ys
  #-}

Maybe some core-literate people can give more information on this?

Greetings,
Joachim

-- 
Joachim nomeata Breitner
  mail: [EMAIL PROTECTED] | ICQ# 74513189 | GPG-Key: 4743206C
  JID: [EMAIL PROTECTED] | http://www.joachim-breitner.de/
  Debian Developer: [EMAIL PROTECTED]


signature.asc
Description: Dies ist ein digital signierter Nachrichtenteil
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] appending an element to a list

2008-05-29 Thread Tillmann Rendel



Adrian Neumann wrote:

Hello,

I was wondering how expensive appending something to a list really is. 
Say I write


I'd say longList ++ [5] stays unevaluated until I consumed the whole 
list and then appending should go in O(1). Similarly when concatenating 
two lists.


Is that true, or am I missing something?


I think that is true and you are missing something: You have to push the 
call to ++ through the whole longList while consuming it wholy one 
element at a time! So when longList has n elements, you have (n+1) calls 
of ++, each returning after O(1) steps. The first n calls return a list 
with the ++ pushed down, and the last returns [5]. Summed together, that 
makes O(n) actual calls of ++ for one written by the programmer.


  Tillmann
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] appending an element to a list

2008-05-29 Thread Abhay Parvate
On Thu, May 29, 2008 at 11:48 PM, Tillmann Rendel [EMAIL PROTECTED]
wrote:



 Adrian Neumann wrote:

 Hello,

 I was wondering how expensive appending something to a list really is. Say
 I write

 I'd say longList ++ [5] stays unevaluated until I consumed the whole
 list and then appending should go in O(1). Similarly when concatenating two
 lists.

 Is that true, or am I missing something?


 I think that is true and you are missing something: You have to push the
 call to ++ through the whole longList while consuming it wholy one element
 at a time! So when longList has n elements, you have (n+1) calls of ++, each
 returning after O(1) steps. The first n calls return a list with the ++
 pushed down, and the last returns [5]. Summed together, that makes O(n)
 actual calls of ++ for one written by the programmer.

  Tillmann


In other words, if you look at the prototype of ++ given in the prelude, it
generates a new list with first (length longList) elements same as those of
longList, followed by the second list. So when you are accessing elements of
(longList ++ s), you are actually accessing the elements of this newly
generated list, which are generated as and when you access them, so that by
the time you reach the first element of s, you have generated (length
longList) elements of the result of ++.



 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe