[ https://issues.apache.org/jira/browse/GIRAPH-908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavan Kumar updated GIRAPH-908:
-------------------------------
    Attachment: GIRAPH-908.patch

> support for partitioned input in giraph
> ---------------------------------------
>
>                 Key: GIRAPH-908
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-908
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Pavan Kumar
>            Assignee: Pavan Kumar
>         Attachments: GIRAPH-908.patch
>
>
> When the graph we need to work on is already partitioned into a fixed number
> of buckets with properties such as high edge locality and low fan-out to
> other buckets (for instance, using techniques such as
> https://people.cam.cornell.edu/~jugander/papers/wsdm13-blp.pdf),
> we should be able to partition the graph in Giraph based on that mapping.
> This makes more requests local and avoids a large amount of network
> communication. This diff is especially useful when we repeatedly run
> algorithms on top of the same graph. In that case, we can compute the
> partitioning once and then reuse it to speed up processing (and reduce
> network bandwidth usage) for the remaining applications on the same or
> similar graphs.
> The diff is big and has been partly reviewed by my colleagues. Putting it
> up for review.

--
This message was sent by Atlassian JIRA
(v6.2#6252)