Hi Sean, Since I've raised this issue before as well, I decided to write some tests.
At this time, I've written/run criterion tests, and I have some hand-wavey theoretical arguments. The theoretical stuff came first (presented below) and isn't influenced by the test results, which I haven't attempted to analyze beyond the most superficial level. If anyone else really cares strongly, please feel free to develop this further. https://github.com/JohnLato/dlist-test/ http://htmlpreview.github.io/?https://github.com/JohnLato/dlist-test/blob/master/testout.html I think more research is warranted, since there are a few anomalies I can't currently explain. On Mon, Nov 18, 2013 at 7:15 AM, Sean Leather <sean.leat...@gmail.com>wrote: > Hi Joachim, > > On Mon, Nov 18, 2013 at 11:21 AM, Joachim Breitner wrote: > > Am Montag, den 18.11.2013, 10:04 +0200 schrieb Sean Leather: >> > * Maintenance and development taken over by Sean Leather >> > * Migrate repository from http://code.haskell.org/~dons/code/dlist/ to >> > https://github.com/spl/dlist >> > * Add `Eq`, `Ord`, `Read`, `Show`, `Alternative`, `Foldable`, >> `Traversable` >> > instances >> >> Given that the point of dlist is to speed up code where lists are >> insufficient, > > > To be a bit more precise, it is not that lists are "insufficient," the > problem is the `(++)` operator (a.k.a. append). To be even more precise, > the problem is left-nested appends, e.g. the expression `(x ++ y) ++ z` may > have a worse traversal time than `x ++ (y ++ z)`. Such an arrangement can > (and probably will) result in multiple traversals of the left argument(s). > > would it make sense to only provide those instances that >> can be implemented without converting to and from lists? >> > > In my opinion, no. It makes sense to have as many reasonable instances as > possible to make the library more attractive and usable. The fact that the > instances convert to and from lists does not detract from their usefulness > because the conversions are not necessarily inefficient (see next response). > > If there are instances that cannot be implemented idiomatically with >> dlists, maybe they should be left out, to signal the user that he will >> not get the benefits of DLists for these. >> > > I think it is an unproven myth that conversion between lists and dlists is > always inefficient. Consider the conversion functions: > > > fromList :: [a] -> DList a > > fromList = DL . (++) > > > toList :: DList a -> [a] > > toList = ($[]) . unDL > > Converting from a list is like prepending (++) to a list. This introduces > a linear traversal of the argument (assuming a complete evaluation of the > converted list). > > Converting to a list is like "finishing it off" with an empty list. The > operation itself is trivial, but traversing the result would evaluate all > the `(++)` from the previous `fromList`s. Fortunately, all of the > `fromList`ed lists are left-arguments to `(++)`, so each will only be > traversed once (which is the primary reason of dlists). > This overlooks the cost of evaluating the DList function itself, which can be significant. DLists are usually formed by snoc'ing/appending k elements/chunks, which means that the total DList is a composition of k separate functions. This structure must be traversed in order to evaluate the resulting list, which makes 'toList' have complexity O(k). If the DList was formed by repeated 'snoc' calls (maybe a common case, maybe not), k==n. calling 'toList' isn't so bad on its own, as typically it only happens once. My concern stems from situations such as using DList as a key in a map, for which many comparisons will be performed and toList will be called multiple times on some elements. Even considering this, I'm not particularly opposed to these instances, but I think users should be aware that they can lead to repeated non-trivial computations. John L.
_______________________________________________ Haskell-platform mailing list Haskell-platform@projects.haskell.org http://projects.haskell.org/cgi-bin/mailman/listinfo/haskell-platform