Thanks. I am trying to use protocol buffers with a map reduce framework to work on large graphs. The graphs are too big to fit in memory so building auxiliary data structures is difficult. Any ideas?
On Oct 22, 2:48 pm, Jeremy Leader <[EMAIL PROTECTED]> wrote: > Protocol Buffers are a serialization format, rather than general-purpose > data structures. To do computations, you'd probably want to build some > auxiliary data structures, which you populate when you deserialize the > protobuf data. You could have node objects that resemble your original > .proto file, where nodes have references to their neighbors, and you'd > probably need a map from node id to node object reference, which you'd > use during deserialization. > > -- > Jeremy Leader > [EMAIL PROTECTED] > > GDR wrote: > > my bad. The code snippet should be as follows: > > > for(UndirectedGraphNode node : UndirectedGraph.getNodesList() ) { > > double sum = 0; > > int count = 0; > > for(UndirectedGraphNodeReference neighbor : > > node.getNeighborsList() ) { > > sum += ???? > > count++; > > } > > node.setWeight(sum/count); > > > } > > > ----- graph.proto ----- > > > package graph; > > > option java_package = "graph"; > > option java_outer_classname = "UndirectedGraphType"; > > option optimize_for = CODE_SIZE; > > > message UndirectedGraphNodeReference { > > required string id = 1; > > } > > > message UndirectedGraphNode { > > required string id = 1; > > required double weight = 2; > > repeated UndirectedGraphNodeReference neighbors = 2; > > } > > > message UndirectedGraph { > > repeated UndirectedGraphNode nodes = 1; > > } > > > On Oct 22, 2:36 pm, GDR <[EMAIL PROTECTED]> wrote: > >> That does solve the duplicate information problem but it makes updates > >> to node attributes (like weight) difficult. Let's say, I want to > >> assign the weight of each node to the average of its neighbors. > > >> for(UndirectedGraphNode node : UndirectedGraph.getNodesList() ) { > >> double sum = 0; > >> int count = 0; > >> for(UndirectedGraphNodeReference neighbor : > >> node.getNeighborsList() ) { > >> sum += ???? > >> count++; > >> } > >> node.setWeight(sum/count); > > >> } > > >> ----- graph.proto ----- > > >> package graph; > > >> option java_package = "graph"; > >> option java_outer_classname = "UndirectedGraphType"; > >> option optimize_for = CODE_SIZE; > > >> message UndirectedGraphNodeReference { > >> required string id = 1; > >> required double weight = 2; > > >> } > > >> message UndirectedGraphNode { > >> required string id = 1; > >> repeated UndirectedGraphNodeReference neighbors = 2; > > >> } > > >> message UndirectedGraph { > >> repeated UndirectedGraphNode nodes = 1; > > >> } > > >> On Oct 22, 2:14 pm, Jeremy Leader <[EMAIL PROTECTED]> wrote: > > >>> I was assuming all the properties of a node (weight, label, color, > >>> whatever) would be in UndirectedGraphNode; UndirectedGraphNodeReference > >>> would only have the id and nothing else. > >>> -- > >>> Jeremy Leader > >>> [EMAIL PROTECTED] > >>> GDR wrote: > >>>> Thanks Jeremy. That worked! > >>>> But we now have information about the same node being replicated. For > >>>> instance, let's say we have a field 'weight' attached to each node as > >>>> shown below. This setup will replicate the weight information of a > >>>> node as many times as its degree. If the weight of a node changes, I > >>>> will have update all it's occurrences in the PB. Any way I can avoid > >>>> it? > >>>> package graph; > >>>> option java_package = "graph"; > >>>> option java_outer_classname = "UndirectedGraphType"; > >>>> option optimize_for = CODE_SIZE; > >>>> message UndirectedGraphNodeReference { > >>>> required string id = 1; > >>>> required double weight = 2; > >>>> } > >>>> message UndirectedGraphNode { > >>>> required string id = 1; > >>>> repeated UndirectedGraphNodeReference neighbors = 2; > >>>> } > >>>> message UndirectedGraph { > >>>> repeated UndirectedGraphNode nodes = 1; > >>>> } > >>>> On Oct 21, 6:37 pm, Jeremy Leader <[EMAIL PROTECTED]> wrote: > >>>>> Keep in mind that protobufs describe serialized data, and there's no > >>>>> concept of an object reference like Java uses. In your example, if A > >>>>> and B are neighbors, then in your proto, the data representing A > >>>>> contains the data representing B, and the data representing B contains > >>>>> the data representing A! > >>>>> One way around this is to implement your own form of references, perhaps > >>>>> using the node ids like this: > >>>>> package graph; > >>>>> option java_package = "graph"; > >>>>> option java_outer_classname = "UndirectedGraph"; > >>>>> option optimize_for = CODE_SIZE; > >>>>> message UndirectedGraphNodeReference { > >>>>> required string id = 0; > >>>>> } > >>>>> message UndirectedGraphNode { > >>>>> required string id = 0; > >>>>> repeated UndirectedGraphNodeReference neighbors; > >>>>> } > >>>>> message UndirectedGraph { > >>>>> repeated UndirectedGraphNode nodes; > >>>>> } > >>>>> -- > >>>>> Jeremy Leader > >>>>> [EMAIL PROTECTED] > >>>>> GDR wrote: > >>>>>> Hi, > >>>>>> I'm wondering how would one go about implementing self-referential > >>>>>> data structures? As an exercise, I tried to implement a PB version of > >>>>>> the adjacency list representation of a graph. I'm having a hard time > >>>>>> getting it work. Any suggestions? > >>>>>> Thanks! > >>>>>> --- graph.proto --- > >>>>>> package graph; > >>>>>> option java_package = "graph"; > >>>>>> option java_outer_classname = "UndirectedGraph"; > >>>>>> option optimize_for = CODE_SIZE; > >>>>>> message UndirectedGraphNode { > >>>>>> required string id = 0; > >>>>>> repeated UndirectedGraphNode neighbors; > >>>>>> } > >>>>>> message UndirectedGraph { > >>>>>> repeated UndirectedGraphNode nodes; > >>>>>> } --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Protocol Buffers" group. To post to this group, send email to protobuf@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/protobuf?hl=en -~----------~----~----~----~------~----~------~--~---