[
https://issues.apache.org/jira/browse/GIRAPH-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416534#comment-13416534
]
Alessandro Presta commented on GIRAPH-256:
------------------------------------------
Agreed on the rename, MAX_VERTICES_PER_PARTITION fooled me too :)
I wonder if we can even skip creating temporary Partition objects and use plain
lists and ids.
Two small changes I would make:
- MAX_*_PER_TX => MAX_*_PER_TRANSFER
- txRegulator => numEdgesInTempPartition or something like that, I think it
would easier to understand
Other than that, I already gave you thumbs up on GIRAPH-247.
> Partitioning outgoing graph data during INPUT_SUPERSTEP by # of vertices
> results in wide variance in RPC message sizes
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: GIRAPH-256
> URL: https://issues.apache.org/jira/browse/GIRAPH-256
> Project: Giraph
> Issue Type: Improvement
> Components: bsp, graph
> Affects Versions: 0.2.0
> Reporter: Eli Reisman
> Assignee: Eli Reisman
> Labels: patch
> Fix For: 0.2.0
>
> Attachments: GIRAPH-256-1.patch, GIRAPH-256-2.patch,
> GIRAPH-256-3.patch
>
>
> This relates to GIRAPH-247. The unfortunately named
> "MAX_VERTICES_PER_PARTITION" fooled me into thinking this value was
> regulating the size of initial Partition objects as they were composed during
> INPUT_SUPERSTEP from InputSplits each worker reads.
> In fact this configuration option only regulates the size of the outgoing RPC
> messages, stored locally in Partition objects but decomposed into Collections
> of BasicVertex for transfer to their eventual homes on another (or this)
> worker. There they are combined into the actual Partitions they will exist in
> for the job run.
> By partitioning these outgoing messages by # of vertices, metrics load tests
> have shown the size of the average message is not well regulated and can
> create overloads on either side of these transfers. This is important because:
> 1. Throughput and memory are at a premium during INPUT_SUPERSTEP.
> 2. Only one crashed worker in a Giraph job causes cascading job failure, even
> in an otherwise healthy workflow.
> This JIRA renames the offending variables/config options and further
> regulates outgoing graph data in INPUT_SUPERSTEP by the # of edges and THEN
> the # of vertices in a candidate for transfer. This much more effectively
> regulates message size for typical social graph data and has been show in
> testing to greatly improve the amount of load-in data Giraph can handle
> without failure given fixed memory and worker limits.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira