On 05/06/2013 07:09 PM, H. S. Teoh wrote: > I would be interested in something like this in D. Though in my case, > I'm not so much interested in the graph data structures themselves, as > the graph algorithms that apply to them, that can be used to solve > various programming problems, like connectivity analysis of large data > sets, reachability analysis, and the like.
Sure. The data structures are a tool to let you implement those algorithms really well ... :-) > Graph coloring algorithms are > also of some interest to me, since they underlie some of the > visualization techniques that I'm interested in. This would be very nice to have. It's not really my direct area of interest, but it would be lovely to have a nice visualization solution as part of the library. > So what I envision as far as graph libraries in D are concerned is > something like the generic range API, where there is a fixed convention > as to what constitutes a graph/digraph/etc., and the library itself is a > template library that is able to handle any data type that conforms to > the requirements. Of course, this does not preclude specific, > non-generic data structures used internally by said graph algorithms, > but for maximal usability, the API shouldn't require the user to perform > large-scale conversion of existing data structures into graph-specific > structures just so he can run, say, a reachability analysis on it. When > working with large data sets, one really wants to just map class X to > nodes, class Y to edges, now give me the connectivity of the resulting > graph (rather than looping over the existing data set to construct a > separate graph representation thereof). That's a nice way of looking at it. Probably the basic graph interface would be one which allows a few basic questions to be asked -- "give me a range of all the nodes in the graph", "give me a range of all the edge pairs", etc., and you'd want properties like isGraph, isDirectedGraph, isMultiGraph, isBipartiteGraph, ... ... and similarly you'd want a node or edge interface which allows for similar properties. > I'm interested. Thank you :-) > Some years ago I implemented Tarjan's algorithm for computing > strongly-connected components in C++. I needed to use it on an > n-dimensional grid of cells, so I tried to make it generic, but due to > limitations in C++ generic programming, I ended up just using an > abstract class for representing the graph. This required implementing a > separate graph class for every data structure I wanted to run the > algorithm on, which is rather ugly. If something like this were done in > D, I expect it would have a friendlier API by allowing Phobos-style > genericity (using range-like abstraction for graph concepts like nodes > and edges, alias for callbacks/closures to abstract away graph > implementation details, etc.). One could, for example, allow overriding > of methods that produce the SCC graph so that the result is produced in > a custom target type that can be used directly by the application > elsewhere. Sounds fun! :-) > This is a bit beyond what I need, but if we can have it, why not? >From my point of view speed and space efficiency are essential because if I continue down the current planned research direction, I'll probably wind up working on graphs with hundreds of thousands of nodes and links. > How would you go about doing this, though? Wouldn't it require some kind > of wrapper API to make it compatible with the Python interpreter? As Ellery suggested, PyD should be our friend here. But I don't think interfacing with Python is an essential short-term goal. > I will be glad to have generic graph algorithms available in D, to > minimize the need for reinventing the wheel. I've seen too much code > that tries to reinvent known graphing algorithms (usually rather poorly) > just because existing graphing libraries are too hard / too obscure to > reuse. Well, here's where I am right now -- I'm currently heading towards a writeup of some fun stuff with colleagues, which will probably take a few weeks. Assuming that we are going to follow up on this, there's a reasonable case for doing this kind of graph work. So, I'll continue to read and think -- and we can continue to discuss here! -- and hopefully we can get towards some kind of action plan. One remark on licensing. I'm conscious of the fact that there is probably great advantage (from a code point of view) in careful examination of the internals of existing graph solutions. On the other hand the most highly performing rival, igraph, is GPL-licensed. This obviously creates a potential problem where (conscious or unconscious) copying or influence is concerned, because it would probably be nice to continue the typical D licensing approach and use Boost. Not that I have any personal objection to the GPL -- it's a license that I would be inclined to choose for many projects -- but where such a generally useful library is concerned, my inclination is to be permissive. Anyone else have any thoughts on this?
