Re: [Neo] Introduction and some thoughts about the coming algorithm component

Marko A. Rodriguez Fri, 23 May 2008 07:16:57 -0700

Hi Patrick,

My name is Marko Rodriguez.

Your problem of calculating N^(2logN) algorithms (e.g. betweenness) on a
massive graph will not go away by simply placing things in memory :).
Also, a relaxation of your depth isn't going to solve much as natural
network have very small diameters -- the entire World Wide Web citation
graph has a diameter of 16 (and thats considered quite big). As we all
know from the Millgram experiments, 6 degrees is all you get from one
person to another in a social network. Thus, once your traverser hits a
super node (highly connected node) ..... slooooooooooowwww...

Furthermore, you are also dealing with the problem of how to implement a
single-relational network analysis algorithms (i.e. unlabeled graph
algorithms -- the standard in nearly all textbooks) on a multi-relational
network (i.e. labeled or semantic graph). Neo being multi-relational and
all popular network analysis algorithms being designed for
single-relational networks.

While a multi-relational network is indeed a collection of
single-relational network, where each single-relational "slice" has its
own meaning (e.g. KNOWS, WORKS_WITH, etc.), you simply can't calculate a
centrality measure on a single single-relational "slice" if you expect to
do any sort of interesting computation. For instance, lets say you want to
calculate the centrality of a cocitation network between scholars but all
you have are WROTE (maps human to article) and CITES (maps article to
article) edges. In such cases, you will need to move in the graph in more
interesting ways... You will need to take a WROTE, CITES, WROTE^-1 and
loop on that pattern.... This is a simple example, and for more
complicated semantics, you will need a traverser that is nearly Turing
complete.

Take care,
Marko.

> Peter Neubauer wrote:
>> Hi Patrik,
>> there is a lot of commits in your lab, maybe it would be cool to have
>> a small intro on what you are doing and what the status of the algos
>> are?
>>
>> /peter
>>
> Hi all!
>
> I actually intended to properly introduce myself when my work is added
> as a component (which was supposed to happen soon), anyway...
>
> For everyone who does not know about my work, I am working on
> implementing some graph algorithms for Neo. So far the focus has been on
> finding shortest paths and using that ability to calculate different
> centrality measures for nodes in a Neo network.
>
> So far regarding shortest paths, there are implementations of Dijkstra
> to find paths between two given nodes (uses bidirectional search) and
> also the classic problem of finding the shortest paths from one source
> node to all others. For unweighted networks, where the relationships
> have an implicit cost of 1, there is a breadth first search which only
> differs from Neos BFS traverser in the way that it can find several
> paths to each node.
>
> Regarding measures and centrality there is so far support for stress
> centrality, betweenness centrality, closeness centrality, eccentricity,
> network radius and network diameter. (For definitions of these, I refer
> to wikipedia).
>
> What I am working on right now, except for properly documenting what i
> have done so far, is experimenting with cutting out parts of a network
> (like a subgraph). Since I would appreciate some thoughts/feedback on
> this I will here describe it in more detail.
>
> The issue is this. When doing calculations, doing it on the entire
> network would just be too much data to handle and take too much time so
> we need some way of limiting what part of the network we run the
> calculation on. One way of doing this (partly already supported) is to
> give each algorithm some kind of limit, for example we can provide our
> BFS search with a maximum depth to traverse or our Dijkstra with a
> maximum number of relationships to relax/consider.
>
> The other way of doing this is to first create an in-memory copy of the
> relevant part of the network, which is what I am working on, and then
> release the calculation on that.
>
> A number of questions arise, like if we would like to keep relationship
> types (makes most sense) or be able to rearrange/rename them or truncate
> them into one combined type.
>
> The most important question however is; how much should we detach this
> copy from the real network?
>
> Should we make an entirely detached copy so we can make changes to it
> without affecting the "real" network, for example be able to create the
> transitive closure or similar? Or should we assume that the algorithm
> beeing used is supposed to make some change to the real network so we
> need to reflect these changes down to it?
>
>
>
> Any feedback and thoughts on this or any other part of my work
> (currently in
> https://svn.neo4j.org/laboratory/users/patrik/neoalgo/trunk/) is much
> appreciated.
>
> This mail became a bit longer than expected, sorry, but thats the way it
> happened ;)
>
> Regards
> Patrik Larsson
>
> _______________________________________________
> Neo mailing list
> [email protected]
> https://lists.neo4j.org/mailman/listinfo/user
>

_______________________________________________
Neo mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user

Re: [Neo] Introduction and some thoughts about the coming algorithm component

Reply via email to