> Niklas Hambüchen <mail <at> nh2.me> writes: > > > On 13/10/13 21:42, AntC wrote: > > ... > > If you use the Set library, that fact may be very visible! > > Because Set re-sequences the whole list, as per its Ord instance. > > > > But List.nub preserves the list sequence > > (except for omitting duplicates). > > I mean *exactly* what you say here. > > ordNub behaves has the same behaviour as nub, while (Set.toList . > Set.fromList) doesn't. >
That's great, thank you. > > [BTW I am still less than convinced that overall a Set-based ordNub is > > significantly more efficient. I suspect it depends on how big is your > > list.] > > What do you mean? > > ordNub is clearly in a different complexity class, ... Yes, I'm not disputing that. > ... and the benchmarks that I provided show not only this, > but also that ordNub is *always* faster than nub, > even for singleton lists. Thanks Niklas, I hadn't spotted those benchmarks back in July. I'm surprised at that result for singletons (and for very small numbers of elements which are in fact each different). Especially because List's `nub` uses `filter` ==> fold, which should be tail-recursive. It seems to me that for small numbers, the Set-based approach still requires comparing each element to each other. Plus there's the overhead for building the Set and inserting each element into it -- where `insert` again walks the Set to find the insertion point. Then here's a further possible optimisation, instead of making separate calls to `member` and `insert`: * Make a single call to insert' :: (Ord a) => a -> Set a -> (Bool, Set a) * The Bool returns True if already a member. * Else returns an updated Set in the snd, with the element inserted. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe