Andrew Coppin wrote: > apfelmus wrote: >> Note that the one usually adds an "end of string" character $ in the >> Burrows-Wheeler transform for compression such that sorting rotated >> strings becomes sorting suffices. > > Yeah, I noticed that the output from by program can never actually be > reverted to its original form.
Well it can, but that's a different story told in Richard S. Bird and Shin-Cheng Mu. Inverting the Burrows-Wheeler transform. http://web.comlab.ox.ac.uk/oucl/work/richard.bird/publications.html #DBLP:journals/jfp/BirdM04 Oh, and you had a function inv_bwt, right? >> Concerning the algorithm at hand, you can clearly avoid calculating >> Raw.append over and over: >> >> bwt :: Raw.ByteString -> Raw.ByteString >> bwt xs = Raw.pack . map (Raw.last) . sort $ rotations >> where >> n = length xs >> rotations = take n . map (take n) . tails $ xs `Raw.append` xs >> >> assuming that take n is O(1). > > I was trying to avoid O(n^2) RAM usage. :-} Note that for ByteStrings, this takes only O(n) RAM because the substrings are shared. But for lists, this would take O(n^2) RAM since (take n) cannot share hole sublists. An O(n) choice for lists that doesn't recalculate ++ all the time would be bwt :: Ord a => [a] -> [a] bwt xs = map last . sortBy (compare `on` (take n)) $ rotations where n = length xs rotations = take n . tails $ xs ++ xs with the well-known on :: (a -> a -> c) -> (b -> a) -> (b -> b -> c) on g f x y = g (f x) (f y) Regards, apfelmus _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe