(Btw, this ring stuff could be relevant for xmonad; I don't know whether the workspace/window-ring implementation there is O(1). Not that it matters for <1000 windows, of course :)

Justin Bailey wrote:
> apfelmus wrote:
>
>> Do you really need to realize the cycle by sharing? I mean, sharing
>> doesn't go well with insertion / updates / deletion since each of these
>> operations breaks it and needs to restore it everywhere. In other words,
>> your  insert  takes O(n) time. I'd simply drop the sharing and use two
>> double ended queues (or something like that) instead
>
> Very good point, and much easier to implement with Data.Sequence to
> boot. All that circular programming made my brain hurt.

There's also a direct and rather lightweight way to implement rings in the spirit of the classic O(1) lazy amortized functional queue implementation. This post tries to explain it.

Here's the basic idea for implementing queues in Haskell: we have a front list to fetch items (head, tail) and a rear list to insert items (snoc) into the queue.

  data Queue a = Queue [a] [a]

  empty                = Queue [] []
  head (Queue (x:f) r) = x
  tail (Queue (x:f) r) = Queue f r
  snoc (Queue f r) x   = Queue f (x:r)

Of course, this doesn't quite work yet: at some point, we have to feed the items from the rear list into the front list. For example, the latest possible moment to do so is when the front list becomes empty.

  balance (Queue [] r) = Queue (reverse r) []
  balance q            = q

  tail (Queue (x:f) r) = balance $ Queue f r
  snoc (Queue f r) x   = balance $ Queue f (x:r)

(Calling balance maintains the invariant that the front list is never empty except when the whole queue is empty, too.) Now, how much time will a single snoc or tail operation take? In the worst case, tail triggers a reverse and takes O(n) time whereas snoc always takes constant time. That's a big blow to our goal of O(1) time for both.
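To make the snippets above easy to play with, here is a self-contained version that can be loaded into GHCi. The Q suffixes on the names are my own, to avoid clashing with Prelude's head and tail (alternatively, one could import Prelude hiding (head, tail)).

```haskell
-- Self-contained version of the queue above.  The Q suffix on the
-- names is mine, to avoid shadowing Prelude's head and tail.
data Queue a = Queue [a] [a] deriving Show

emptyQ :: Queue a
emptyQ = Queue [] []

-- Invariant: the front list is empty only if the whole queue is.
balanceQ :: Queue a -> Queue a
balanceQ (Queue [] r) = Queue (reverse r) []
balanceQ q            = q

headQ :: Queue a -> a
headQ (Queue (x:_) _) = x
headQ _               = error "headQ: empty queue"

tailQ :: Queue a -> Queue a
tailQ (Queue (_:f) r) = balanceQ (Queue f r)
tailQ _               = error "tailQ: empty queue"

snocQ :: Queue a -> a -> Queue a
snocQ (Queue f r) x = balanceQ (Queue f (x:r))

-- Items come out in insertion (FIFO) order:
--   headQ (foldl snocQ emptyQ [1..4])          evaluates to 1
--   headQ (tailQ (foldl snocQ emptyQ [1..4]))  evaluates to 2
```

Note that snocQ calls balanceQ as well, so that inserting into an empty queue restores the invariant immediately.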

But luckily, queues don't come out of "thin air": they all have to be constructed from the empty queue by a sequence of snoc and tail applications. Can the heavy O(n) cost of a worst-case tail be spread over the many good cases of tail and snoc in that sequence? Yes, it can. To that end, we increase the price of each snoc by one time "coin", so each item of the rear list gets inserted with one extra coin as "credit". With these credits, we can pay the whole length (rear list) cost of a reverse operation when it occurs, making tail O(1) again. This is called _amortization_, and O(1) is the _amortized_ cost of tail .

The above works fine if the queue is used in a single-threaded way, i.e. as an _ephemeral_ data structure. But it no longer works when a queue is used multiple times in a _persistent_ setting. Assuming that  tail q  triggers a  reverse , the first evaluation of q1 in

  let
     q1 = tail q
     q2 = tail q
     q3 = tail q
     ...
  in ... q1 .. q2 .. q3

will use up all the credits, and q2, q3, ... have none left to spend, so they are back to worst-case behavior.

In the persistent setting, lazy evaluation comes to the rescue. The idea is to create the (yet unevaluated) call to reverse earlier, namely when the rear list has more elements than the front list.

  balance (Queue f r)
     | length r >= length f = Queue (f ++ reverse r) []
  balance q = q

(We assume that length has been made O(1) by storing the lengths explicitly.) Now, the O(length r) reverse will not be evaluated before having "tailed" through the previous front list with length f == length r items. Thus, we can spread the cost of reverse as "debits" over these elements. When finally executing reverse , its debits have already been paid off and tail is O(1) again. And once executed, lazy evaluation memoizes the result, so that sharing doesn't duplicate the work. (Note that strict languages without side effects are doomed to be slower when persistence matters. Ha! ;)
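Concretely, such a length-tracking queue might look like the following sketch. The field layout and names here are my own assumptions; the post's balancing condition  length r >= length f  is kept as-is.

```haskell
-- Sketch of the lazy ("banker's") queue: the lengths of both lists
-- are stored explicitly, so the balancing test is O(1).
data Queue a = Queue !Int [a] !Int [a]   -- lenF, front, lenR, rear
  deriving Show

emptyQ :: Queue a
emptyQ = Queue 0 [] 0 []

-- Set up the (lazy, not yet evaluated) reverse as soon as the rear
-- list catches up with the front list.
balanceQ :: Queue a -> Queue a
balanceQ q@(Queue lf f lr r)
  | lr >= lf  = Queue (lf + lr) (f ++ reverse r) 0 []
  | otherwise = q

headQ :: Queue a -> a
headQ (Queue _ (x:_) _ _) = x
headQ _                   = error "headQ: empty queue"

tailQ :: Queue a -> Queue a
tailQ (Queue lf (_:f) lr r) = balanceQ (Queue (lf - 1) f lr r)
tailQ _                     = error "tailQ: empty queue"

snocQ :: Queue a -> a -> Queue a
snocQ (Queue lf f lr r) x = balanceQ (Queue lf f (lr + 1) (x : r))
```

The front list stays lazy on purpose: the suspended  f ++ reverse r  is exactly the thunk whose cost the debit argument spreads over the subsequent tail operations.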

So much for a too short introduction to the classic purely functional queue implementation. For a detailed exposition and much more, see also

  Chris Okasaki. Purely Functional Data Structures. (Thesis)
  http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf

or his book with the same title which arose from this thesis.


Now, rings can be implemented in a similar style.

  data Ring a = Ring [a] a [a]

  rotL (Ring ls x (r:rs)) = balance $ Ring (x:ls) r rs
  rotR (Ring (l:ls) x rs) = balance $ Ring ls l (x:rs)

(For simplicity, we only deal with the case where the left and right lists are non-empty.) How to balance? In contrast to queues, doing a full reverse when one list is empty doesn't even work in the ephemeral case, since a rotR following a rotL would undo the reverse with yet another expensive reverse . But we can apply the same idea as for persistent queues and balance as soon as one list becomes, say, twice (or three times, or whatever) as large as the other one:

  balance (Ring ls x rs)
    | length ls > 2*length rs = r'
    | length rs > 2*length ls = r'
    where
    n  = length ls + length rs
    k  = n `div` 2
    r' = Ring
           (take k     $ ls ++ reverse (drop (n-k) rs))
           x
           (take (n-k) $ rs ++ reverse (drop k     ls))

  balance r = r

This will make  rotL  and  rotR  run in O(1) amortized time.
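For experimenting, here is the ring code above assembled into a loadable sketch. It still handles only non-empty side lists (completing it is exercise 1), it rebalances with plain length calls (which one would replace by stored lengths, as with the queue), and the accessor name  focus  is my invention.

```haskell
-- Minimal runnable sketch of the ring.  Only the case where both side
-- lists are non-empty is handled; `length` stands in for stored lengths.
data Ring a = Ring [a] a [a] deriving Show

balanceR :: Ring a -> Ring a
balanceR r@(Ring ls x rs)
  | length ls > 2 * length rs = r'
  | length rs > 2 * length ls = r'
  | otherwise                 = r
  where
    n  = length ls + length rs
    k  = n `div` 2
    -- Redistribute: elements at the far end of one list wrap around
    -- to become the far end of the other.
    r' = Ring (take k     $ ls ++ reverse (drop (n-k) rs))
              x
              (take (n-k) $ rs ++ reverse (drop k     ls))

rotL, rotR :: Ring a -> Ring a
rotL (Ring ls x (r:rs)) = balanceR $ Ring (x:ls) r rs
rotL _                  = error "rotL: right list empty (exercise 1)"
rotR (Ring (l:ls) x rs) = balanceR $ Ring ls l (x:rs)
rotR _                  = error "rotR: left list empty (exercise 1)"

-- Hypothetical accessor for the current element.
focus :: Ring a -> a
focus (Ring _ x _) = x
```

For example, with a five-element ring  Ring [2,1] 3 [4,5]  (focus 3, neighbors stored nearest-first),  focus (rotL ...)  gives 4, and a rotR after the rotL brings the focus back to 3.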


Exercises:
1) Complete the implementation of rotL and rotR . Besides dealing with possibly empty ls and rs , it's also possible to assume them non-empty and use special constructors for rings with <= 2 elements.
2) Use the balancing scheme for rings to implement double-ended queues, i.e. queues with both cons and snoc .
3) Read Okasaki's book and prove my O(1) claims :)


Regards,
apfelmus

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
