Thanks. I am trying to use protocol buffers with a map reduce
framework to work on large graphs. The graphs are too big to fit in
memory so building auxiliary data structures is difficult. Any ideas?
On Oct 22, 2:48 pm, Jeremy Leader <[EMAIL PROTECTED]> wrote:
> Protocol Buffers are a serialization format, rather than general-purpose
> data structures. To do computations, you'd probably want to build some
> auxiliary data structures, which you populate when you deserialize the
> protobuf data. You could have node objects that resemble your original
> .proto file, where nodes have references to their neighbors, and you'd
> probably need a map from node id to node object reference, which you'd
> use during deserialization.
>
> --
> Jeremy Leader
> [EMAIL PROTECTED]
>
> GDR wrote:
> > my bad. The code snippet should be as follows:
>
> > for(UndirectedGraphNode node : UndirectedGraph.getNodesList() ) {
> > double sum = 0;
> > int count = 0;
> > for(UndirectedGraphNodeReference neighbor :
> > node.getNeighborsList() ) {
> > sum += ????
> > count++;
> > }
> > node.setWeight(sum/count);
>
> > }
>
> > ----- graph.proto -----
>
> > package graph;
>
> > option java_package = "graph";
> > option java_outer_classname = "UndirectedGraphType";
> > option optimize_for = CODE_SIZE;
>
> > message UndirectedGraphNodeReference {
> > required string id = 1;
> > }
>
> > message UndirectedGraphNode {
> > required string id = 1;
> > required double weight = 2;
> > repeated UndirectedGraphNodeReference neighbors = 2;
> > }
>
> > message UndirectedGraph {
> > repeated UndirectedGraphNode nodes = 1;
> > }
>
> > On Oct 22, 2:36 pm, GDR <[EMAIL PROTECTED]> wrote:
> >> That does solve the duplicate information problem but it makes updates
> >> to node attributes (like weight) difficult. Let's say, I want to
> >> assign the weight of each node to the average of its neighbors.
>
> >> for(UndirectedGraphNode node : UndirectedGraph.getNodesList() ) {
> >> double sum = 0;
> >> int count = 0;
> >> for(UndirectedGraphNodeReference neighbor :
> >> node.getNeighborsList() ) {
> >> sum += ????
> >> count++;
> >> }
> >> node.setWeight(sum/count);
>
> >> }
>
> >> ----- graph.proto -----
>
> >> package graph;
>
> >> option java_package = "graph";
> >> option java_outer_classname = "UndirectedGraphType";
> >> option optimize_for = CODE_SIZE;
>
> >> message UndirectedGraphNodeReference {
> >> required string id = 1;
> >> required double weight = 2;
>
> >> }
>
> >> message UndirectedGraphNode {
> >> required string id = 1;
> >> repeated UndirectedGraphNodeReference neighbors = 2;
>
> >> }
>
> >> message UndirectedGraph {
> >> repeated UndirectedGraphNode nodes = 1;
>
> >> }
>
> >> On Oct 22, 2:14 pm, Jeremy Leader <[EMAIL PROTECTED]> wrote:
>
> >>> I was assuming all the properties of a node (weight, label, color,
> >>> whatever) would be in UndirectedGraphNode; UndirectedGraphNodeReference
> >>> would only have the id and nothing else.
> >>> --
> >>> Jeremy Leader
> >>> [EMAIL PROTECTED]
> >>> GDR wrote:
> >>>> Thanks Jeremy. That worked!
> >>>> But we now have information about the same node being replicated. For
> >>>> instance, let's say we have a field 'weight' attached to each node as
> >>>> shown below. This setup will replicate the weight information of a
> >>>> node as many times as its degree. If the weight of a node changes, I
> >>>> will have update all it's occurrences in the PB. Any way I can avoid
> >>>> it?
> >>>> package graph;
> >>>> option java_package = "graph";
> >>>> option java_outer_classname = "UndirectedGraphType";
> >>>> option optimize_for = CODE_SIZE;
> >>>> message UndirectedGraphNodeReference {
> >>>> required string id = 1;
> >>>> required double weight = 2;
> >>>> }
> >>>> message UndirectedGraphNode {
> >>>> required string id = 1;
> >>>> repeated UndirectedGraphNodeReference neighbors = 2;
> >>>> }
> >>>> message UndirectedGraph {
> >>>> repeated UndirectedGraphNode nodes = 1;
> >>>> }
> >>>> On Oct 21, 6:37 pm, Jeremy Leader <[EMAIL PROTECTED]> wrote:
> >>>>> Keep in mind that protobufs describe serialized data, and there's no
> >>>>> concept of an object reference like Java uses. In your example, if A
> >>>>> and B are neighbors, then in your proto, the data representing A
> >>>>> contains the data representing B, and the data representing B contains
> >>>>> the data representing A!
> >>>>> One way around this is to implement your own form of references, perhaps
> >>>>> using the node ids like this:
> >>>>> package graph;
> >>>>> option java_package = "graph";
> >>>>> option java_outer_classname = "UndirectedGraph";
> >>>>> option optimize_for = CODE_SIZE;
> >>>>> message UndirectedGraphNodeReference {
> >>>>> required string id = 0;
> >>>>> }
> >>>>> message UndirectedGraphNode {
> >>>>> required string id = 0;
> >>>>> repeated UndirectedGraphNodeReference neighbors;
> >>>>> }
> >>>>> message UndirectedGraph {
> >>>>> repeated UndirectedGraphNode nodes;
> >>>>> }
> >>>>> --
> >>>>> Jeremy Leader
> >>>>> [EMAIL PROTECTED]
> >>>>> GDR wrote:
> >>>>>> Hi,
> >>>>>> I'm wondering how would one go about implementing self-referential
> >>>>>> data structures? As an exercise, I tried to implement a PB version of
> >>>>>> the adjacency list representation of a graph. I'm having a hard time
> >>>>>> getting it work. Any suggestions?
> >>>>>> Thanks!
> >>>>>> --- graph.proto ---
> >>>>>> package graph;
> >>>>>> option java_package = "graph";
> >>>>>> option java_outer_classname = "UndirectedGraph";
> >>>>>> option optimize_for = CODE_SIZE;
> >>>>>> message UndirectedGraphNode {
> >>>>>> required string id = 0;
> >>>>>> repeated UndirectedGraphNode neighbors;
> >>>>>> }
> >>>>>> message UndirectedGraph {
> >>>>>> repeated UndirectedGraphNode nodes;
> >>>>>> }
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"Protocol Buffers" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at
http://groups.google.com/group/protobuf?hl=en
-~----------~----~----~----~------~----~------~--~---