Re: Container hierarchy vs. container types

Steven Schveighoffer Fri, 05 Mar 2010 05:02:22 -0800

On Thu, 04 Mar 2010 18:22:55 -0500, Andrei Alexandrescu <seewebsiteforem...@erdani.org> wrote:

> Needless to say, I'd be very curious to hear other opinions in this matter. Please speak up - the future of D depends, in small part, on this too.

As you know, you and I disagree on some aspects of containers. However, I think our philosophies are converging.

Having built a container library, I can tell you the one and only reason I made it have an interface hierarchy -- Tango. Tango had a container library, which was based on ancient Doug Lea collections, with some added features. Because I intended dcollections to replace Tango's collections, I tried to encompass the same feature set that it had. One of those features was the ability to use interfaces without knowing the implementation. Since then, Tango has replaced their container collection with something different, and guess what -- no more interface hierarchy. They have a single interface ICollection, and all containers implement that interface. No Map interface, or List interface or anything.

However, although I think generic containers can avoid *requiring* an interface, if you design your classes well, slapping on an interface costs almost nothing. It depends on whether you wish to use classes or structs to implement containers. I still think classes are better, because modifying one aspect of a class is easy to do. If someone wants to make a new type of HashMap that does everything my HashMap does, except changes one little bit, it's really easy. With the advent of alias this, it's also possible to do something similar with structs, but not as straightforward, and not without recompiling. Also, passing containers around by value by default is one of the aspects of STL that I think sucks. When working with STL, I almost never passed around a container by value; I always used a reference, because passing by value can incur large hidden allocation costs.
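To make the class-versus-struct point concrete, here is a minimal sketch. The names `Bag`, `UniqueBag`, `BagS`, `UniqueBagS`, `fill`, and `fillS` are made up for illustration and are not dcollections API. It shows one way to read "not as straightforward": code written against the base class picks up an override through virtual dispatch, while code written against the wrapped struct only ever sees the original method.

```d
// Hypothetical, stripped-down containers; not the dcollections API.
class Bag
{
    protected int[] items;
    void add(int v) { items ~= v; }
    size_t length() const { return items.length; }
}

// Class route: override one method, inherit everything else.
class UniqueBag : Bag
{
    override void add(int v)
    {
        foreach (x; items)
            if (x == v) return;   // skip duplicates
        super.add(v);
    }
}

// Struct route: the same container as a value type.
struct BagS
{
    int[] items;
    void add(int v) { items ~= v; }
    size_t length() const { return items.length; }
}

// Wrap the original and forward unresolved members via alias this.
struct UniqueBagS
{
    BagS base;
    alias base this;

    void add(int v)
    {
        foreach (x; base.items)
            if (x == v) return;   // skip duplicates
        base.add(v);
    }
}

// Functions written against the base types.
void fill(Bag b)       { b.add(1); b.add(1); }
void fillS(ref BagS b) { b.add(1); b.add(1); }

void main()
{
    auto c = new UniqueBag;
    fill(c);                 // virtual dispatch reaches UniqueBag.add
    assert(c.length == 1);

    UniqueBagS s;
    fillS(s);                // alias this hands over s.base, so BagS.add runs
    assert(s.length == 2);

    UniqueBagS t;
    t.add(1);                // called directly, the wrapper's add is used
    t.add(1);
    assert(t.length == 1);
}
```

For the struct version to behave like the class version, `fillS` would have to be a template (or be rewritten against `UniqueBagS`), and that means recompiling the code that uses it.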

I'll go over a quick set of points that are pro-interface. First, using an interface hides the implementation. It may not be possible to have your code on display for the compiler to use. Using an interface is a perfectly acceptable way to hide proprietary code that you cannot legally divulge. This is probably the weakest of the points, but I put it out there. Second, D is a statically compiled language, but with the (hopefully soon) evolution to dynamic linking, using an interface is ideal. If you for instance wish to pass a map to or from a plugin library, using an interface is probably the best way to do it. Interfaces are less susceptible to implementation changes/differences. Third, code that uses an interface is compiled once per interface. Code that uses duck typing is compiled once per set of arguments. While this might not seem like much, it can reduce the footprint of generated code. Using duck typing, you may have two almost identical generated functions that differ only by the function addresses used. Finally, interfaces simplify understanding. Once you have used an interface, you know "oh yeah, this is a map, so I can use it like a map." You can strive to build a container library that follows those principles, even making assertions that force the compiler to prove those principles, but it's not as easy for a person to understand as it is to look at interface documentation and know what it does. This becomes important when using libraries that use special implementations of containers, like for instance a database result or an XML tree.
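The third point can be sketched as follows. `Map`, `HashMap`, `ArrayMap`, `bumpViaInterface`, and `bumpViaTemplate` are hypothetical names chosen for this example, not the dcollections or Tango interfaces. The interface version is one function whose calls go through the vtable; the duck-typed version is a template that the compiler instantiates separately for each argument type.

```d
// Hypothetical minimal map interface.
interface Map
{
    int get(string key);
    void set(string key, int value);
}

// Backed by a built-in associative array.
class HashMap : Map
{
    private int[string] data;
    int get(string key) { return data[key]; }
    void set(string key, int value) { data[key] = value; }
}

// Backed by two parallel arrays and a linear search.
class ArrayMap : Map
{
    private string[] keys;
    private int[] values;

    int get(string key)
    {
        foreach (i, k; keys)
            if (k == key) return values[i];
        throw new Exception("missing key: " ~ key);
    }

    void set(string key, int value)
    {
        foreach (i, k; keys)
            if (k == key) { values[i] = value; return; }
        keys ~= key;
        values ~= value;
    }
}

// Interface version: compiled once, works for any Map implementation.
int bumpViaInterface(Map m, string key)
{
    m.set(key, m.get(key) + 1);
    return m.get(key);
}

// Duck-typed version: instantiated once per distinct argument type M.
int bumpViaTemplate(M)(M m, string key)
{
    m.set(key, m.get(key) + 1);
    return m.get(key);
}

void main()
{
    auto h = new HashMap;
    auto a = new ArrayMap;
    h.set("x", 1);
    a.set("x", 1);

    // Same compiled function for both containers.
    assert(bumpViaInterface(h, "x") == 2);
    assert(bumpViaInterface(a, "x") == 2);

    // Two instantiations: bumpViaTemplate!HashMap and bumpViaTemplate!ArrayMap.
    assert(bumpViaTemplate(h, "x") == 3);
    assert(bumpViaTemplate(a, "x") == 3);
}
```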

Interfaces in other languages can be viewed as advantageous in other ways, but D has advanced compile-time interfaces so far that those don't really matter in D. For example, declaring that a function requires a map container can be done with duck typing via conditional compilation.
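As a rough illustration of "requires a map container" expressed through conditional compilation, the sketch below uses a template constraint. `isMapLike`, `countAbove`, and `TinyMap` are hypothetical names invented here; the check only asks for index read and index assignment, which is an assumption about what "map-like" should mean.

```d
// Hypothetical compile-time check: can M be read and written by a K key
// with V values? The body is never run; it only has to compile.
enum bool isMapLike(M, K, V) = is(typeof(
{
    M m = M.init;
    K k = K.init;
    V v = V.init;
    m[k] = v;        // assignable by key
    V got = m[k];    // readable by key
}()));

// Accepts anything that passes the check; no interface involved.
size_t countAbove(M)(M m, string[] keys, int limit)
    if (isMapLike!(M, string, int))
{
    size_t n;
    foreach (k; keys)
        if (m[k] > limit) ++n;
    return n;
}

// A hypothetical user-defined container that satisfies the check.
struct TinyMap
{
    private string[] ks;
    private int[] vs;

    int opIndex(string k)
    {
        foreach (i, key; ks)
            if (key == k) return vs[i];
        throw new Exception("missing key: " ~ k);
    }

    void opIndexAssign(int v, string k)
    {
        foreach (i, key; ks)
            if (key == k) { vs[i] = v; return; }
        ks ~= k;
        vs ~= v;
    }
}

void main()
{
    int[string] aa = ["a": 1, "b": 5, "c": 9];
    assert(countAbove(aa, ["a", "b", "c"], 3) == 2);

    TinyMap t;
    t["a"] = 4;
    t["b"] = 2;
    assert(countAbove(t, ["a", "b"], 3) == 1);

    // A plain array cannot be indexed by string, so it is rejected at compile time.
    static assert(!isMapLike!(int[], string, int));
    static assert(!__traits(compiles, countAbove([1, 2, 3], ["a"], 0)));
}
```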

At the same time, just like I think ranges don't fit every model (*cough* I/O), interfaces aren't the answer to every aspect of containers. I don't think ranges fit well with interfaces, because iterating interface ranges prevents inlining -- the major draw of ranges in the first place -- and ranges are so much more useful with value semantics. I also think functions that can be tuned to each implementation should be. For this reason, dcollections containers provide a lot of functionality that is not included in the interfaces, simply because the functions are so specific to the implementation, it would be the only class that implemented that interface. For example, all the functions that return cursors (and soon ranges) are not interface functions. This doesn't make them useless via interfaces, but you cannot use every aspect of the container via an interface. An interface is like a common denominator, and I think it should be useful for some purposes. If nobody will ever use the interface as a parameter to a function, then there is no point in declaring the interface (I realize that I have created such interfaces in dcollections and I plan to correct that -- one nice benefit of the contemplation triggered by this discussion).
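The range point can be sketched with what current Phobos offers, assuming a recent compiler where `std.range.interfaces` provides `InputRange` and `inputRangeObject` (this module layout postdates this thread). `sumStatic` and `sumVirtual` are hypothetical names. The templated version sees the concrete range type, so `empty`, `front`, and `popFront` are ordinary calls the compiler may inline; the interface version makes a virtual call for each of them, and the boxed range has reference rather than value semantics.

```d
import std.range : iota;
import std.range.interfaces : InputRange, inputRangeObject;
import std.range.primitives : isInputRange;

// Duck-typed: instantiated for the concrete range type.
int sumStatic(R)(R r)
    if (isInputRange!R)
{
    int total;
    for (; !r.empty; r.popFront())
        total += r.front;
    return total;
}

// Interface-based: every empty/front/popFront is a virtual call.
int sumVirtual(InputRange!int r)
{
    int total;
    for (; !r.empty; r.popFront())
        total += r.front;
    return total;
}

void main()
{
    auto r = iota(0, 5);                  // struct range: 0, 1, 2, 3, 4
    assert(sumStatic(r) == 10);
    assert(!r.empty);                     // r was passed by value, still intact

    InputRange!int boxed = inputRangeObject(iota(0, 5));
    assert(sumVirtual(boxed) == 10);
    assert(boxed.empty);                  // class reference: the caller's range was consumed
}
```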

I am working on updating dcollections as we post, and I think I have come up with a nifty way to do ranges that will both retain the usefulness of the cursor, and provide a common way to plug the collections into std.algorithm.

Good luck with your containers. I still hold out hope that dcollections can be integrated in Phobos, but I probably need to get it working in order to have any chance of competing :)


-Steve
