Re: [Haskell-cafe] ordNub

2013-10-13 Thread AntC
Niklas Hambüchen mail at nh2.me writes: In sets, the order does not matter, while for nub it does. Let's be careful here!. Niklas, when you say order, do you mean: * the _ordering_ from the Ord instance? Or * the relative sequence of elements in the list? ... the fact that Set is used

Re: [Haskell-cafe] ordNub

2013-10-13 Thread Niklas Hambüchen
On 13/10/13 21:42, AntC wrote: Niklas Hambüchen mail at nh2.me writes: In sets, the order does not matter, while for nub it does. Let's be careful here!. Niklas, when you say order, do you mean: * the _ordering_ from the Ord instance? Or * the relative sequence of elements in the list?

Re: [Haskell-cafe] ordNub

2013-10-13 Thread AntC
Niklas Hambüchen mail at nh2.me writes: On 13/10/13 21:42, AntC wrote: ... If you use the Set library, that fact may be very visible! Because Set re-sequences the whole list, as per its Ord instance. But List.nub preserves the list sequence (except for omitting duplicates). I

Re: [Haskell-cafe] ordNub

2013-10-13 Thread Niklas Hambüchen
On 14/10/13 03:20, AntC wrote: Thanks Niklas, I hadn't spotted those benchmarks back in July. No worries :) I'm surprised at that result for singletons (and for very small numbers of elements which are in fact each different). I think one of the main reasons for the performance difference

Re: [Haskell-cafe] ordNub

2013-10-13 Thread AntC
Niklas Hambüchen mail at nh2.me writes: On 14/10/13 03:20, AntC wrote: ... Then here's a further possible optimisation, instead of making separate calls to `member` and `insert`: This I understand again. Where do you get insert' from? containers doesn't seem to have it. Do you

Re: [Haskell-cafe] ordNub

2013-10-12 Thread Niklas Hambüchen
I would like to come back to the original question: How can ordNub be added to base? I guess we agree that Data.List is the right module for a function of type Ord a = [a] - [a], but this introduces * a cyclic dependency between Data.List and Data.Set * a base dependency on containers. What is

Re: [Haskell-cafe] ordNub

2013-10-12 Thread Anthony Cowley
On Oct 12, 2013, at 2:47 PM, Niklas Hambüchen m...@nh2.me wrote: I would like to come back to the original question: How can ordNub be added to base? I guess we agree that Data.List is the right module for a function of type Ord a = [a] - [a], but this introduces * a cyclic

Re: [Haskell-cafe] ordNub

2013-10-12 Thread Niklas Hambüchen
On 12/10/13 20:43, Anthony Cowley wrote: I think nub's behavior is rather set-related, so I don't really understand the objection to putting it in Data.Set. In sets, the order does not matter, while for nub it does. nub:: Eq a = [a] - [a] ordNub :: Ord a = [a] - [a] both do not

Re: [Haskell-cafe] ordNub

2013-10-12 Thread Roman Cheplyaka
* Anthony Cowley acow...@seas.upenn.edu [2013-10-12 15:43:57-0400] On Oct 12, 2013, at 2:47 PM, Niklas Hambüchen m...@nh2.me wrote: I would like to come back to the original question: How can ordNub be added to base? I guess we agree that Data.List is the right module for a

Re: [Haskell-cafe] ordNub

2013-08-26 Thread Niklas Hambüchen
On 14/07/13 20:20, Niklas Hambüchen wrote: As you might not know, almost *all* practical Haskell projects use it, and that in places where an Ord instance is given, e.g. happy, Xmonad, ghc-mod, Agda, darcs, QuickCheck, yesod, shake, Cabal, haddock, and 600 more (see

Re: [Haskell-cafe] ordNub

2013-08-19 Thread AntC
Richard A. O'Keefe ok at cs.otago.ac.nz writes: There are at least four different things that an Ord version might mean: - first sort a list, then eliminate duplicates - sort a list eliminating duplicates stably as you go (think 'merge sort', using 'union' instead of 'merge') -

Re: [Haskell-cafe] ordNub

2013-07-16 Thread Ketil Malde
Francesco Mazzoli f...@mazzo.li writes: import qualified Data.HashSet as S nub :: Hashable a = [a] - [a] nub = S.toList . S.fromList Well, the above is not stable while Niklas’ is. But I guess that’s not the point of your message :). We could also implement Data.BloomFilter.nub, which

Re: [Haskell-cafe] ordNub

2013-07-16 Thread Andreas Abel
On 14.07.2013 13:20, Niklas Hambüchen wrote: I've taken the Ord-based O(n * log n) implementation from yi using a Set: ordNub :: (Ord a) = [a] - [a] ordNub l = go empty l where go _ [] = [] go s (x:xs) = if x `member` s then go s xs

Re: [Haskell-cafe] ordNub

2013-07-15 Thread Clark Gaebel
Apologies. I was being lazy. Here's a stable version: import qualified Data.HashSet as S hashNub :: (Ord a) = [a] - [a] hashNub l = go S.empty l where go _ [] = [] go s (x:xs) = if x `S.member` s then go s xs else x : go (S.insert x

Re: [Haskell-cafe] ordNub

2013-07-15 Thread Niklas Hambüchen
Hey Jason, would you mind giving a short idea of what the point of Bird's implementation is / from what properties it is derived? Also, running the QuickCheck tests you added, it doesn't give the same output (order) as nub. On 15/07/13 13:26, Jason Dagit wrote: Richard Bird has a book, Pearls

Re: [Haskell-cafe] ordNub

2013-07-15 Thread Brandon Allbery
On Sun, Jul 14, 2013 at 7:54 AM, Clark Gaebel cgae...@uwaterloo.ca wrote: Oops sorry I guess my point wasn't clear. Why ord based when hashable is faster? Then there's no reason this has to be in base, it can just be a Did the point about stable fly overhead? -- brandon s allbery kf8nh

Re: [Haskell-cafe] ordNub

2013-07-15 Thread Jason Dagit
On Sun, Jul 14, 2013 at 4:20 AM, Niklas Hambüchen m...@nh2.me wrote: tldr: nub is abnormally slow, we shouldn't use it, but we do. As you might know, Data.List.nub is O(n²). (*) As you might not know, almost *all* practical Haskell projects use it, and that in places where an Ord instance

Re: [Haskell-cafe] ordNub

2013-07-15 Thread John Lato
In my tests, using unordered-containers was slightly slower than using Ord, although as the number of repeated elements grows unordered-containers appears to have an advantage. I'm sure the relative costs of comparison vs hashing would affect this also. But both are dramatically better than the

Re: [Haskell-cafe] ordNub

2013-07-15 Thread Ivan Lazar Miljenovic
On 16 July 2013 11:46, John Lato jwl...@gmail.com wrote: In my tests, using unordered-containers was slightly slower than using Ord, although as the number of repeated elements grows unordered-containers appears to have an advantage. I'm sure the relative costs of comparison vs hashing would

Re: [Haskell-cafe] ordNub

2013-07-15 Thread John Lato
On Tue, Jul 16, 2013 at 10:31 AM, Ivan Lazar Miljenovic ivan.miljeno...@gmail.com wrote: On 16 July 2013 11:46, John Lato jwl...@gmail.com wrote: In my tests, using unordered-containers was slightly slower than using Ord, although as the number of repeated elements grows

Re: [Haskell-cafe] ordNub

2013-07-15 Thread Clark Gaebel
I'm procrastinating something else, so I wrote the patch to unordered-containers. Feel free to comment on the github link: https://github.com/tibbe/unordered-containers/pull/67 I'm still against having an Ord version, since my intuition tells me that hash-based data structures are faster than

Re: [Haskell-cafe] ordNub

2013-07-15 Thread Brandon Allbery
On Mon, Jul 15, 2013 at 10:31 PM, Ivan Lazar Miljenovic ivan.miljeno...@gmail.com wrote: If I understand correctly, this function is proposed to be added to Data.List which lives in base... but the proposals here are about using either Sets from containers or HashSet from

Re: [Haskell-cafe] ordNub

2013-07-15 Thread Richard A. O'Keefe
On 16/07/2013, at 3:21 PM, Clark Gaebel wrote: I'm still against having an Ord version, since my intuition tells me that hash-based data structures are faster than ordered ones. There are at least four different things that an Ord version might mean: - first sort a list, then eliminate

Re: [Haskell-cafe] ordNub

2013-07-15 Thread Clark Gaebel
nubBy is a very good suggestion. Added! Regarding good hash functions: if your data structure is algebraic, you can derive generic and Hashable will give you a pretty good hash function: data ADT a = C0 Int String | C1 [a] deriving Generic instance Hashable a = Hashable (ADT a) It's

Re: [Haskell-cafe] ordNub

2013-07-15 Thread Conrad Parker
On 16 July 2013 10:31, Ivan Lazar Miljenovic ivan.miljeno...@gmail.com wrote: On 16 July 2013 11:46, John Lato jwl...@gmail.com wrote: In my tests, using unordered-containers was slightly slower than using Ord, although as the number of repeated elements grows unordered-containers appears to

[Haskell-cafe] ordNub

2013-07-14 Thread Niklas Hambüchen
tldr: nub is abnormally slow, we shouldn't use it, but we do. As you might know, Data.List.nub is O(n²). (*) As you might not know, almost *all* practical Haskell projects use it, and that in places where an Ord instance is given, e.g. happy, Xmonad, ghc-mod, Agda, darcs, QuickCheck, yesod,

Re: [Haskell-cafe] ordNub

2013-07-14 Thread Clark Gaebel
Similarly, I've always used: import qualified Data.HashSet as S nub :: Hashable a = [a] - [a] nub = S.toList . S.fromList And i can't think of any type which i can't write a Hashable instance, so this is extremely practical. On Jul 14, 2013 7:24 AM, Niklas Hambüchen m...@nh2.me wrote: tldr:

Re: [Haskell-cafe] ordNub

2013-07-14 Thread Clark Gaebel
Oops sorry I guess my point wasn't clear. Why ord based when hashable is faster? Then there's no reason this has to be in base, it can just be a free function in Data.HashSet. If stability is a concern then there's a way to easily account for that using HashMap. - Clark On Jul 14, 2013 7:48

Re: [Haskell-cafe] ordNub

2013-07-14 Thread Niklas Hambüchen
One of my main points is: Should we not add such a function (ord-based, same output as nub, stable, no sorting) to base? As the package counting shows, if we don't offer an alternative, people obviously use it, and not to our benefit. (Not to say it this way: We could make the Haskell world

Re: [Haskell-cafe] ordNub

2013-07-14 Thread Roman Cheplyaka
Something like that should definitely be included in Data.List. Thanks for working on it. Roman * Niklas Hambüchen m...@nh2.me [2013-07-14 19:20:52+0800] tldr: nub is abnormally slow, we shouldn't use it, but we do. As you might know, Data.List.nub is O(n²). (*) As you might not know,

Re: [Haskell-cafe] ordNub

2013-07-14 Thread Francesco Mazzoli
At Sun, 14 Jul 2013 07:31:05 -0400, Clark Gaebel wrote: Similarly, I've always used: import qualified Data.HashSet as S nub :: Hashable a = [a] - [a] nub = S.toList . S.fromList And i can't think of any type which i can't write a Hashable instance, so this is extremely practical.

Re: [Haskell-cafe] ordNub

2013-07-14 Thread Joey Adams
On Sun, Jul 14, 2013 at 7:31 AM, Clark Gaebel cgae...@uwaterloo.ca wrote: Similarly, I've always used: import qualified Data.HashSet as S nub :: Hashable a = [a] - [a] nub = S.toList . S.fromList And i can't think of any type which i can't write a Hashable instance, so this is extremely

Re: [Haskell-cafe] ordNub

2013-07-14 Thread Conrad Parker
On 15 July 2013 09:54, Joey Adams joeyadams3.14...@gmail.com wrote: On Sun, Jul 14, 2013 at 7:31 AM, Clark Gaebel cgae...@uwaterloo.ca wrote: Similarly, I've always used: import qualified Data.HashSet as S nub :: Hashable a = [a] - [a] nub = S.toList . S.fromList And i can't think of any

Re: [Haskell-cafe] ordNub

2013-07-14 Thread Thomas DuBuisson
Just so people are aware - five years ago the notion of nubOrd and nubWith was discussed and a consensus reached on including nubOrd. I think Bart got too busy, didn't submit a final patch, and no one with commit access actually commited any code.