Re: [Haskell-cafe] Re: Need for speed: the Burrows-Wheeler Transform

Andrew Coppin Sat, 23 Jun 2007 03:42:46 -0700

apfelmus wrote:

Andrew Coppin wrote:

Yeah, I noticed that the output from by program can never actually be
reverted to its original form.


Well it can, but that's a different story told in

  Richard S. Bird and Shin-Cheng Mu.
  Inverting the Burrows-Wheeler transform.
  http://web.comlab.ox.ac.uk/oucl/work/richard.bird/publications.html
  #DBLP:journals/jfp/BirdM04

Oh, and you had a function inv_bwt, right?

Implementation #1 had an inverse - but it works by finding an unusedcharacter and prepending that to the input before doing the forward BWT. ;-)


For the other implementations, there is no inverse at all.

I was trying to avoid O(n^2) RAM usage. :-}


Note that for ByteStrings, this takes only O(n) RAM because the
substrings are shared. But for lists, this would take O(n^2) RAM since
(take n) cannot share hole sublists.


Presumably that's only true for *strict* ByteStrings?

(I still don't actually understand how lazy bytestrings are anydifferent to normal lists - or why they produced such a massiveperformance increase...)

An O(n) choice for lists that
doesn't recalculate ++ all the time would be

   bwt :: Ord a => [a] -> [a]
   bwt xs = map last . sortBy (compare `on` (take n)) $ rotations
     where
     n         = length xs
     rotations = take n . tails $ xs ++ xs

with the well-known

   on :: (a -> a -> c) -> (b -> a) -> (b -> b -> c)
   on g f x y = g (f x) (f y)


Interesting... But how does it perform? ;-)

_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Re: Need for speed: the Burrows-Wheeler Transform

Reply via email to