Arie Groeneveld wrote:
Hi,
Wondering about time space consuming: 'nub' vs 'map head.group.sort'
Consider:
ry = [1..10000] ++ replicate 13 5 ++ replicate 21 34
*Main> length . nub $ ry
10000
(5.18 secs, 1050000 bytes)
*Main> length . map head . group . sort $ ry
10000
(0.03 secs, 6293384 bytes)
Time space
nub --- +
fnub +++ -
+ is better ;-)
Thanks
@@i=arie
nub is working on unsorted input.
If you want (nub.sort) then the best thing to use is a merge sort that discards
duplicates as it works. Copying and modifying GHC's Data.List.sort code:
> -- stolen from http://darcs.haskell.org/packages/base/Data/List.hs
> -- with 'merge' changed to discard duplicates.
nsort l = mergesort compare l
nsortBy cmp l = mergesort compare l
mergesort :: (a -> a -> Ordering) -> [a] -> [a]
mergesort cmp = mergesort' cmp . map wrap
mergesort' :: (a -> a -> Ordering) -> [[a]] -> [a]
mergesort' cmp [] = []
mergesort' cmp [xs] = xs
mergesort' cmp xss = mergesort' cmp (merge_pairs cmp xss)
merge_pairs :: (a -> a -> Ordering) -> [[a]] -> [[a]]
merge_pairs cmp [] = []
merge_pairs cmp [xs] = [xs]
merge_pairs cmp (xs:ys:xss) = merge cmp xs ys : merge_pairs cmp xss
merge :: (a -> a -> Ordering) -> [a] -> [a] -> [a]
merge cmp xs [] = xs
merge cmp [] ys = ys
merge cmp (x:xs) (y:ys)
= case x `cmp` y of
GT -> y : merge cmp (x:xs) ys
LT -> x : merge cmp xs (y:ys)
EQ -> x : merge cmp xs ys
wrap :: a -> [a]
wrap x = [x]
Then you can use nsort or nsortBy, which benchmark (with -O2) as slightly faster
than (map head . group . sort)
Cheers,
Chris
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe