Abstractly, a lot of graph algorithms fall into this paradigm:
foreach node in the graph
neighbors = getNeighbors(graph, node);
doSomething(node, neighbors);
end
I am trying to parallelize this operation for large graphs with
millions of nodes. So, one approach would be to partition
Thanks. I am trying to use protocol buffers with a map reduce
framework to work on large graphs. The graphs are too big to fit in
memory so building auxiliary data structures is difficult. Any ideas?
On Oct 22, 2:48 pm, Jeremy Leader [EMAIL PROTECTED] wrote:
Protocol Buffers are a serialization
Sounds like fun. Just curious what is actually supposed to be encoded
in the protobuf message, if the graph is too big to fit in memory?
Are you sending just a single node, with its edges? It seems like you
wouldn't be able to avoid scenarios where some of the edges in your
message point to
Thanks Jeremy. That worked!
But we now have information about the same node being replicated. For
instance, let's say we have a field 'weight' attached to each node as
shown below. This setup will replicate the weight information of a
node as many times as its degree. If the weight of a node
I was assuming all the properties of a node (weight, label, color,
whatever) would be in UndirectedGraphNode; UndirectedGraphNodeReference
would only have the id and nothing else.
--
Jeremy Leader
[EMAIL PROTECTED]
GDR wrote:
Thanks Jeremy. That worked!
But we now have information about
That does solve the duplicate information problem but it makes updates
to node attributes (like weight) difficult. Let's say, I want to
assign the weight of each node to the average of its neighbors.
for(UndirectedGraphNode node : UndirectedGraph.getNodesList() ) {
double sum = 0;
int
my bad. The code snippet should be as follows:
for(UndirectedGraphNode node : UndirectedGraph.getNodesList() ) {
double sum = 0;
int count = 0;
for(UndirectedGraphNodeReference neighbor :
node.getNeighborsList() ) {
sum +=
count++;
}
node.setWeight(sum/count);
}
Protocol Buffers are a serialization format, rather than general-purpose
data structures. To do computations, you'd probably want to build some
auxiliary data structures, which you populate when you deserialize the
protobuf data. You could have node objects that resemble your original
Hi,
I'm wondering how would one go about implementing self-referential
data structures? As an exercise, I tried to implement a PB version of
the adjacency list representation of a graph. I'm having a hard time
getting it work. Any suggestions?
Thanks!
--- graph.proto ---
package graph;