Repository: spark Updated Branches: refs/heads/master a6394bc2c -> 5e3ec1110
[Minor] Fix comments for GraphX 2D partitioning strategy The sum of vertices on matrix (v0 to v11) is 12. And, I think one same block overlaps in this strategy. This is minor PR, so I didn't file in JIRA. Author: kj-ki <kikushima.ke...@lab.ntt.co.jp> Closes #3904 from kj-ki/fix-partitionstrategy-comments and squashes the following commits: 79829d9 [kj-ki] Fix comments for 2D partitioning. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5e3ec111 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5e3ec111 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5e3ec111 Branch: refs/heads/master Commit: 5e3ec1110495899a298313c4aa9c6c151c1f54da Parents: a6394bc Author: kj-ki <kikushima.ke...@lab.ntt.co.jp> Authored: Tue Jan 6 09:49:37 2015 -0800 Committer: Ankur Dave <ankurd...@gmail.com> Committed: Tue Jan 6 09:49:37 2015 -0800 ---------------------------------------------------------------------- .../main/scala/org/apache/spark/graphx/PartitionStrategy.scala | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/spark/blob/5e3ec111/graphx/src/main/scala/org/apache/spark/graphx/PartitionStrategy.scala ---------------------------------------------------------------------- diff --git a/graphx/src/main/scala/org/apache/spark/graphx/PartitionStrategy.scala b/graphx/src/main/scala/org/apache/spark/graphx/PartitionStrategy.scala index 13033fe..7372dfb 100644 --- a/graphx/src/main/scala/org/apache/spark/graphx/PartitionStrategy.scala +++ b/graphx/src/main/scala/org/apache/spark/graphx/PartitionStrategy.scala @@ -32,9 +32,9 @@ trait PartitionStrategy extends Serializable { object PartitionStrategy { /** * Assigns edges to partitions using a 2D partitioning of the sparse edge adjacency matrix, - * guaranteeing a `2 * sqrt(numParts)` bound on vertex replication. + * guaranteeing a `2 * sqrt(numParts) - 1` bound on vertex replication. * - * Suppose we have a graph with 11 vertices that we want to partition + * Suppose we have a graph with 12 vertices that we want to partition * over 9 machines. We can use the following sparse matrix representation: * * <pre> @@ -61,7 +61,7 @@ object PartitionStrategy { * that edges adjacent to `v11` can only be in the first column of blocks `(P0, P3, * P6)` or the last * row of blocks `(P6, P7, P8)`. As a consequence we can guarantee that `v11` will need to be - * replicated to at most `2 * sqrt(numParts)` machines. + * replicated to at most `2 * sqrt(numParts) - 1` machines. * * Notice that `P0` has many edges and as a consequence this partitioning would lead to poor work * balance. To improve balance we first multiply each vertex id by a large prime to shuffle the --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org