Arie Groeneveld wrote:
Hi,

Wondering about time space consuming:  'nub' vs 'map head.group.sort'

Consider:

ry = [1..10000] ++ replicate 13 5 ++ replicate 21 34



*Main> length . nub $ ry
10000
(5.18 secs, 1050000 bytes)

*Main> length . map head . group . sort $ ry
10000
(0.03 secs, 6293384 bytes)

       Time     space
nub     ---      +
fnub    +++      -

+ is better ;-)

Thanks

@@i=arie


nub is working on unsorted input.

If you want (nub.sort) then the best thing to use is a merge sort that discards duplicates as it works. Copying and modifying GHC's Data.List.sort code:

> -- stolen from http://darcs.haskell.org/packages/base/Data/List.hs
> -- with 'merge' changed to discard duplicates.

nsort l = mergesort compare l
nsortBy cmp l = mergesort compare l

mergesort :: (a -> a -> Ordering) -> [a] -> [a]
mergesort cmp = mergesort' cmp . map wrap

mergesort' :: (a -> a -> Ordering) -> [[a]] -> [a]
mergesort' cmp [] = []
mergesort' cmp [xs] = xs
mergesort' cmp xss = mergesort' cmp (merge_pairs cmp xss)

merge_pairs :: (a -> a -> Ordering) -> [[a]] -> [[a]]
merge_pairs cmp [] = []
merge_pairs cmp [xs] = [xs]
merge_pairs cmp (xs:ys:xss) = merge cmp xs ys : merge_pairs cmp xss

merge :: (a -> a -> Ordering) -> [a] -> [a] -> [a]
merge cmp xs [] = xs
merge cmp [] ys = ys
merge cmp (x:xs) (y:ys)
 = case x `cmp` y of
        GT -> y : merge cmp (x:xs)   ys
        LT -> x : merge cmp    xs (y:ys)
        EQ -> x : merge cmp    xs    ys

wrap :: a -> [a]
wrap x = [x]


Then you can use nsort or nsortBy, which benchmark (with -O2) as slightly faster than (map head . group . sort)

Cheers,
  Chris
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to