Re: [GHC] #1218: Add sortNub and sortNubBy to Data.List
#1218: Add sortNub and sortNubBy to Data.List +--- Reporter: neil| Owner: Type: proposal| Status: closed Priority: normal | Milestone: Not GHC Component: libraries/base |Version: 6.8.2 Severity: normal | Resolution: wontfix Keywords: | Difficulty: Unknown Testcase: | Architecture: Unknown Os: Linux | +--- Changes (by guest): * cc: [EMAIL PROTECTED] (added) * version: = 6.8.2 * os: Unknown = Linux Comment: This may be abandoned, but I'm curious. Is this actually an optimization? I've been trying out some toy examples, and the 'map head . group . sort' version seems to perform absolutely terribly compared to just 'sort . nub', at least on my sample. Example: [EMAIL PROTECTED]:15969~cat tmp2.hs [ 9:58AM] import Data.List main = print $ uniq $ take 10 $ cycle [1] uniq = sort . nub% [EMAIL PROTECTED]:15970~time runhaskell tmp2.hs [ 9:58AM] [1] runhaskell tmp2.hs 45.90s user 0.23s system 66% cpu 1:09.24 total [EMAIL PROTECTED]:15971~cat tmp4.hs [ 9:59AM] import Data.Set main = print $ uniq $ take 10 $ cycle [1] uniq = toList . fromList [EMAIL PROTECTED]:15974~time runhaskell tmp4.hs [10:00AM] [1] runhaskell tmp4.hs 38.14s user 0.14s system 96% cpu 39.743 total [EMAIL PROTECTED]:15975~cat tmp3.hs [10:02AM] import Data.List main = print $ uniq $ take 10 $ cycle [1] uniq = map head . group . sort [EMAIL PROTECTED]:15976~ You'll understand if I don't provide timings for tmp3.hs (the map head implementation) since it takes vastly longer than 30 seconds and eats so much RAM and CPU time that it locks my system overnight. Admittedly, perhaps a list of repeated items is not the best usecase, but shouldn't optimizations - particularly optimizations in the base library be a win in all cases? (and particularly not huge pessimizations in some cases)? -- gwern -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/1218#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler___ Glasgow-haskell-bugs mailing list Glasgow-haskell-bugs@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs
Re: [GHC] #1218: Add sortNub and sortNubBy to Data.List
#1218: Add sortNub and sortNubBy to Data.List +--- Reporter: neil| Owner: Type: proposal| Status: closed Priority: normal | Milestone: Not GHC Component: libraries/base |Version: Severity: normal | Resolution: wontfix Keywords: | Difficulty: Unknown Testcase: | Architecture: Unknown Os: Unknown | +--- Changes (by igloo): * status: new = closed * resolution: = wontfix Comment: This proposal seems to be abandoned -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/1218#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler___ Glasgow-haskell-bugs mailing list Glasgow-haskell-bugs@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs
Re: [GHC] #1218: Add sortNub and sortNubBy to Data.List
That would actually be wrong. It's easy to come up with examples where this identity is false. For instance: data T = T Int Int deriving (Ord, Show) instance Eq T where T x _ == T y _ = x == y ts = [T 1 1, T 1 undefined] On Mar 18, 2007, at 23:51 , GHC wrote: #1218: Add sortNub and sortNubBy to Data.List +--- Reporter: neil| Owner: Type: proposal| Status: new Priority: normal | Milestone: Not GHC Component: libraries/base |Version: Severity: normal | Resolution: Keywords: | Difficulty: Unknown Testcase: | Architecture: Unknown Os: Unknown | +--- Comment (by dons): How about rather than changing the api, purely for efficiency reasons, we just add a rewrite rule to Data.List for this optimisation? The rule would be: {{{ {-# RULES sort/nub sort . nub = map head . group . sort #-} }}} sort . nub will be rewritten to the optimised case, and we won't need to further complicate the API. Quite often now it is possible for the library author to expose only a small, generic API, while providing internal rules for specific optimised cases. This keeps the interface small, while still providing performance. Data.ByteString does this in a number of places, for example: {{{ {-# RULES FPS specialise filter (== x) forall x. filter (== x) = filterByte x #-} }}} I'd really not like to see more functions exposed in the list interface, only as optimisations, that could be better provided via RULES. -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/1218 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler___ Glasgow-haskell-bugs mailing list Glasgow-haskell-bugs@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs ___ Glasgow-haskell-bugs mailing list Glasgow-haskell-bugs@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs
Re: [GHC] #1218: Add sortNub and sortNubBy to Data.List
#1218: Add sortNub and sortNubBy to Data.List +--- Reporter: neil| Owner: Type: proposal| Status: new Priority: normal | Milestone: Not GHC Component: libraries/base |Version: Severity: normal | Resolution: Keywords: | Difficulty: Unknown Testcase: | Architecture: Unknown Os: Unknown | +--- Comment (by dons): How about rather than changing the api, purely for efficiency reasons, we just add a rewrite rule to Data.List for this optimisation? The rule would be: {{{ {-# RULES sort/nub sort . nub = map head . group . sort #-} }}} sort . nub will be rewritten to the optimised case, and we won't need to further complicate the API. Quite often now it is possible for the library author to expose only a small, generic API, while providing internal rules for specific optimised cases. This keeps the interface small, while still providing performance. Data.ByteString does this in a number of places, for example: {{{ {-# RULES FPS specialise filter (== x) forall x. filter (== x) = filterByte x #-} }}} I'd really not like to see more functions exposed in the list interface, only as optimisations, that could be better provided via RULES. -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/1218 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler___ Glasgow-haskell-bugs mailing list Glasgow-haskell-bugs@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs