Re: [Pharo-project] [squeak-dev] Re: Graph library in Smalltalk - Need for advices

Alexandre Bergel Tue, 21 Dec 2010 16:49:27 -0800

> Stéphane, regarding Mondrian - I've looked over the code and while it's ok
> for what it does, I think everyone would admit it's not designed to be a
> real graph library. It does what it needs to do and no more, so I would keep
> it that way for now. You really do need to design a good graph lib from the
> ground up.


Mondrian does not mean to replace GraphViz or anything like this. 
Note that there is a bridge Mondrian <--> GraphViz.

Cheers,
Alexandre


> 
> It's been my experience after using probably 20 different graph libraries in
> real projects over the years that the following themes reoccur:
> 
> 1. Library is good for visualization, but not for calculations - shortest
> path, centrality, MST, etc.
> 
> 2. Library is good for algorithms but has no visualization support
> 
> 3. Graph library relies too much on other libraries that are not that
> impressive either. Graphviz comes to mine. Great tool, but frankly the
> visualizations looks awful by today's standards and it's harder to do
> something fully dynamic. Piping a dot file to a unix process to produce a
> static image is not terribly useful in things like web applications.
> 
> 4. Library does not scale predictably or to a certain point. 
> 
> 5. Library does not support persistence properly.
> 
> 6. Library forces an API that takes over your data too much or doesn't play
> well with existing data. Any model that forces inheritance for Vertex and
> Edge classes comes to mind. See neo4j for some examples of a poor API on
> this point.
> 
> 7. Library is good all around but lacks depth (missing algorithms, measures,
> etc) and does not play well with others.
> 
> 8. Every library has its own idea of what a vertex and edge should be and
> there's no good way of passing data between them in a nice manner.
> 
> 9. Visualization libraries have a multitude of issues - specific to desktop
> or web only, static output, no capability to annotate vertex or edges, no
> pruning... I have yet to find one that is both pretty and useful to this day
> without nearly recreating a new library each time.
> 
> The reality is every library will have to make compromises at a minimum in
> the following areas:
> 
> 1. Scalability
> 2. Data structures
> 3. Algorithms
> 4. Performance
> 5. Visualization
> 
> I think you need to at least consider all the above when creating a library.
> In fact, you probably should break it up as many libraries do into several
> components:
> 
> 1. Common, useful data structures (Vertex, Edge, Adjacency List, Adjaceny
> Matrix, etc.) that are optimized for your language - hopefully smalltalk
> 
> 2. Computations / Measurements
> 
> 3. Algorithms - layout, common graph problems, etc.
> 
> 4. Persistence - database or otherwise, likely several supporting libs
> specific to the persistence mechanism or implementing some kind of pattern
> to be agnostic. The data structures should at least be designed in such a
> way that they are persistence friendly to put in something like Gemstone,
> Magma, Image, etc. See for example InfiniteGraph which has a Java interface
> built on Objectivity.
> 
> 5. Visualization - desktop, web, and static output.
> 
> The problem here is that creating a graph library is a huge undertaking. It
> might not sound like it, but they can grow to epic proportions. From my
> comments, you can see that it is really not the fault of any particular lib,
> but rather the subject matter. You could spend a lifetime packing in
> features. The real key is just to create a series of libs that work well
> together and not constantly reinvent the wheel with node classes in each
> library. 
> 
> A secondary problem is doing graph analysis and even drawing graphs can take
> a lot of horse power. Parallelism is a tough issue with graph libraries and
> there's a multitude of approaches from pure threads to map/reduce to random
> walking and stream processing.  This is further compounded by persistence
> where you need to start doing things like graph healing, partitioning, and
> sharding to scale to massive levels and maintain performance.
> 
> I just want to mention to others that graphs are hugely useful in general to
> solve a variety of problems from recommendations to meta programming. It's
> not just pretty pictures and experiments with molecules. There's a lot of
> potential in Smalltalk to do something since generally I feel Smalltalkers
> aren't bound by or afraid of new approaches to old problems. Graph databases
> in many cases could replace relational DBs for example and let your app do
> all kinds of stuff you might never have thought of otherwise.
> 
> I could go on and on, but I'll stop myself here. Comments or thoughts?
> 
> 
> 
> 
> 
> 
> Stéphane Ducasse wrote:
>> 
>>> 
>>> I've started looking at the exemples YossiDM gave to me and in particular
>>> Lemon which was according to him his best experience. I found the model
>>> quite clear and covering all what I expect for a generic graph lib
>>> (directed, undirected, mapping concept, iterators, and algorithms of
>>> course). Moreover and contrary to Boost, it's still developed in 2010.
>>> 
>>> To be more precise, here's what I expect for a generic graph lib in
>>> smalltalk (note all in Lemon except visualization):
>>> 
>>> - data structure: directed graphs, undirected graphs, possible loop and
>>> parallel edged, ..., trees (?)
>>> - mapping: easily map objects, informations on nodes and/or edges (here,
>>> don't know if I'd like subclassing nodes/eges instead...)
>>> - iterators: efficient way to iterate over nodes and edges
>>> - algorithms: basic algorithms implementation (bfs, dfs, ..., shortest
>>> paths, ...), and plug-ability for specific ones...
>>> - visualization: having an interactive graph visualization web/SVG and
>>> eventually morphic (... graphviz, mondrian, .......)
>>> 
>>> then, I could use this for my research work...
>>> - I need "belief" nodes with associated conditional beliefs tables
>>> - I need to implement algorithms to propagate an information change
>>> (change of a node state) in any nodes... 
>>> ****mainly, I'd like to get junction trees from a graph [1] which are
>>> rely useful for several domains in machine learning field ***
>>> 
>>> Actually, I don't know if I really need a graph lib as a simple
>>> implementation directed to bayesian should be enough but it's the second
>>> time I need graphs (last time was for planification) and I think that
>>> would be great to have a nice and clean basic implementation.
>>> 
>>> Couldn't we start developing something similar to Lemon (regarding "API",
>>> enitites, etc...) that would work for small scale project project in
>>> smalltalk ?
>> 
>> It would be excellent.
>> Because now that you have a full time permanent position you can invest a
>> bit and in 2 years you can get something really sexy....
>> This is what we are doing all the time around pharo.
>> 
>>> Yossi, what were the limitations you found with Lemon ?
>>> 
>>> Cheers,
>>> 
>>> Cédrick
>>> 
>> 
>> 
>> 
>> 
> 
> -- 
> View this message in context: 
> http://forum.world.st/Graph-library-in-Smalltalk-Need-for-advices-tp3092747p3159886.html
> Sent from the Pharo Smalltalk mailing list archive at Nabble.com.
> 

-- 
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.

Re: [Pharo-project] [squeak-dev] Re: Graph library in Smalltalk - Need for advices

Reply via email to