On Sat, Nov 22, 2008 at 1:20 PM, Jason Dusek <[EMAIL PROTECTED]> wrote:
> Ryan Ingram <[EMAIL PROTECTED]> wrote:
>> ...persistent data structures tend to have much worse constant
>> factors and those factors translate to a general 2x-3x
>> slowdown.
>
> Can you explain why that is, or provide a citation for it?
> It's not something I've found easy to Google.
Consider insertion into a simple binary search tree (no balancing). The persistent algorithm is:

    data Tree = Tip | Node Key Tree Tree

    insert :: Key -> Tree -> Tree
    insert k Tip = Node k Tip Tip
    insert k (Node k' l r)
        | k < k'    = Node k' (insert k l) r
        | otherwise = Node k' l (insert k r)

The ephemeral algorithm is:

    import Data.IORef

    data Tree = Tip | Node Key (IORef Tree) (IORef Tree)

    insert :: Key -> IORef Tree -> IO ()
    insert k p = do
        t <- readIORef p
        case t of
            Tip -> do
                l <- newIORef Tip
                r <- newIORef Tip
                writeIORef p (Node k l r)
            Node k' l r -> insert k $ if k < k' then l else r

The big difference between these two algorithms is the amount of allocation and copying going on. Both dereference essentially the same number of pointers, but the ephemeral algorithm allocates exactly one new node and mutates exactly one pointer in memory. The persistent algorithm, on the other hand, copies the entire path it traverses down the tree, allocating one fresh node per level. (All of the "Tip" nodes can be shared; Tip plays the same role as NULL in C.)

Unfortunately, I don't have any references; the 2x-3x figure is an intuitive number from my past experience. The slowdown is worse for algorithms where you have to explicitly simulate pointers with maps because the structure is inherently ephemeral (rough sketch of what I mean in the P.S.), and better for simple structures like the binary tree above.

-- ryan
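
P.S. Here's a minimal sketch of the "simulating pointers with maps" case, since it's the worst offender. Union-find is my example, and the names here are mine, purely for illustration: the mutable heap becomes an IntMap from node ids to parent ids, so every pointer write turns into an O(log n) map update that itself allocates and copies a path inside the map, where the ephemeral version is a single O(1) array write.

    import qualified Data.IntMap as IM

    -- The "heap": node id -> parent id. A root points to itself.
    type Heap = IM.IntMap Int

    -- Follow parent "pointers" up to the root (no path compression).
    find :: Heap -> Int -> Int
    find h x = case IM.lookup x h of
        Just p | p /= x -> find h p
        _               -> x

    -- Link the root of x to the root of y. Ephemerally this is one
    -- array write; here it is a map insert, which allocates O(log n)
    -- fresh IntMap nodes.
    union :: Heap -> Int -> Int -> Heap
    union h x y = IM.insert (find h x) (find h y) h

Starting from, say, IM.fromList [(i, i) | i <- [0..3]], each call to union pays the map-update cost on top of the two finds, which is exactly the kind of constant factor I had in mind.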