[GitHub] spark pull request: [SPARK-3650][GraphX] Triangle Count handles re...

2016-02-21 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/11290#issuecomment-186951853 This looks good to me. @insidedctm thanks for reviving the PR and @srowen thanks for taking a look at this! My only minor concern is that it will change the results

[GitHub] spark pull request: Adding zipPartitions to PySpark

2016-01-04 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/10550#issuecomment-168628357 @davies and @JoshRosen I have finished a working prototype that passes the tests. I would be interested in your thoughts. --- If your project is set up

[GitHub] spark pull request: Adding zipPartitions to PySpark

2016-01-04 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/10550#issuecomment-168850430 @davies thanks for taking a look! I will open a JIRA issue later today. With respect to the disk-based design, I had considered it but it has a few limitations

[GitHub] spark pull request: Adding zipPartitions to PySpark

2016-01-04 Thread jegonzal
Github user jegonzal closed the pull request at: https://github.com/apache/spark/pull/10550

[GitHub] spark pull request: Adding zipPartitions to PySpark

2016-01-03 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/10550#issuecomment-168568011 retest this please

[GitHub] spark pull request: Adding zipPartitions to PySpark

2016-01-03 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/10550#issuecomment-168566477 retest this please

[GitHub] spark pull request: Adding zipPartitions to PySpark

2016-01-03 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/10550#issuecomment-168497312 retest this please

[GitHub] spark pull request: Adding zipPartitions to PySpark

2016-01-01 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/10550#issuecomment-168368506 retest this please

[GitHub] spark pull request: Adding zipPartitions to PySpark

2016-01-01 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/10550#issuecomment-168362426 @davies and @JoshRosen let me know what you think of this design.

[GitHub] spark pull request: Adding zipPartitions to PySpark

2016-01-01 Thread jegonzal
GitHub user jegonzal opened a pull request: https://github.com/apache/spark/pull/10550 Adding zipPartitions to PySpark The following working WIP adds support for `zipPartitions` to PySpark. This is accomplished by modifying the PySpark `worker` (in both daemon and non-daemon mode

[GitHub] spark pull request: [SPARK-11432][GraphX] Personalized PageRank sh...

2015-11-02 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/9386#issuecomment-153207973 This is actually a pretty serious error since it could lead to mass being accumulated on unreachable sub-graphs. The performance implications of the above branch

[GitHub] spark pull request: [SPARK-4086][GraphX]: Fold-style aggregation f...

2015-09-02 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/5142#issuecomment-137170613 @srowen GraphX is still active; we have just been pretty busy with some other changes. Let me see what needs to be done with this PR.

[GitHub] spark pull request: [SPARK-9001] Fixing errors in javadocs that le...

2015-07-13 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/7354#issuecomment-120998075 I will make the suggested changes now.

[GitHub] spark pull request: [SPARK-9001] Fixing errors in javadocs that le...

2015-07-13 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/7354#issuecomment-121000162 I have merged upstream changes and added back the requested paragraph blocks (correctly).

[GitHub] spark pull request: [SPARK-9001] Fixing errors in javadocs that le...

2015-07-12 Thread jegonzal
Github user jegonzal commented on a diff in the pull request: https://github.com/apache/spark/pull/7354#discussion_r34419761 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java --- @@ -25,9 +25,9 @@ import static

[GitHub] spark pull request: [SPARK-9001] Fixing errors in javadocs that le...

2015-07-11 Thread jegonzal
GitHub user jegonzal opened a pull request: https://github.com/apache/spark/pull/7354 [SPARK-9001] Fixing errors in javadocs that lead to failed build/sbt doc These are minor corrections in the documentation of several classes that are preventing: ```bash build/sbt

[GitHub] spark pull request: Improved GraphX PageRank Test Coverage

2015-05-18 Thread jegonzal
Github user jegonzal closed the pull request at: https://github.com/apache/spark/pull/1228

[GitHub] spark pull request: Improved GraphX PageRank Test Coverage

2015-05-18 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1228#issuecomment-103186580 I think we have covered most of this code in later tests (PR #1217) and the remaining tests need to be substantially updated which I can do in a later PR. I am going

[GitHub] spark pull request: Spark-5854 personalized page rank

2015-05-01 Thread jegonzal
Github user jegonzal commented on a diff in the pull request: https://github.com/apache/spark/pull/4774#discussion_r29521247 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/lib/PageRank.scala --- @@ -103,8 +132,14 @@ object PageRank extends Logging

[GitHub] spark pull request: Spark-5854 personalized page rank

2015-05-01 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/4774#issuecomment-98196331 Overall this looks great! I apologize for the delayed response. I am going to go ahead and merge this now and then we can tune the performance in a later pull

[GitHub] spark pull request: [SPARK-3376] Add in-memory shuffle option.

2015-04-30 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/5403#issuecomment-97975070 This PR could have important performance implications for algorithms in GraphX and MLlib (e.g., ALS) which introduce relatively lightweight shuffle stages at each

[GitHub] spark pull request: [GraphX] initialmessage for pagerank should be...

2015-02-22 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1128#issuecomment-75451894 Great! I agree with this proposal as well. I apologize for letting it sit so long.

[GitHub] spark pull request: [SPARK-3650] Fix TriangleCount handling of rev...

2015-01-21 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/2495#issuecomment-70925718 Great! What else needs to be done? There was some discussion about how this might change the semantics of the triangle count function? Is this still true

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-01-11 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-69504481 We should really address this stack overflow issue. Is there a JIRA we can promote?

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-01-11 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-69531109 Hmm, we really need to elevate this to a full issue. I have run into the stack overflow in MLlib (ALS) as well.

[GitHub] spark pull request: Removing confusing TripletFields

2014-11-25 Thread jegonzal
GitHub user jegonzal opened a pull request: https://github.com/apache/spark/pull/3472 Removing confusing TripletFields After additional discussion with @rxin, I think having all the possible `TripletField` options is confusing. This pull request reduces the triplet fields

[GitHub] spark pull request: Removing confusing TripletFields

2014-11-25 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/3472#issuecomment-64520673 @ankurdave, what do you think?

[GitHub] spark pull request: Removing confusing TripletFields

2014-11-25 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/3472#issuecomment-64520711 This is consistent with the current discussion in the graphx programming guide and so it is unlikely users have started using the more obscure combinations that were

[GitHub] spark pull request: Updating GraphX programming guide and document...

2014-11-19 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/3359#issuecomment-63743607 Sounds good. I can fix it now if you want. Joey

[GitHub] spark pull request: Updating GraphX programming guide and document...

2014-11-18 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/3359#issuecomment-63597028 @rxin and @ankurdave, take a look when you get a chance.

[GitHub] spark pull request: Improved GraphX PageRank Test Coverage

2014-11-18 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1228#issuecomment-63597173 @ankurdave and @rxin can we merge this now?

[GitHub] spark pull request: Introducing an Improved Pregel API

2014-11-18 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-63597212 @ankurdave should I try and update this with your latest changes or do you want to create a new one?

[GitHub] spark pull request: Updating GraphX programming guide and document...

2014-11-18 Thread jegonzal
Github user jegonzal commented on a diff in the pull request: https://github.com/apache/spark/pull/3359#discussion_r20559596 --- Diff: project/SparkBuild.scala --- @@ -328,7 +328,7 @@ object Unidoc { unidocProjectFilter in(ScalaUnidoc, unidoc) := inAnyProject

[GitHub] spark pull request: Drop VD type parameter from EdgeRDD

2014-11-16 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/3303#issuecomment-63267156 Looks good to me.

[GitHub] spark pull request: Introducing an Improved Pregel API

2014-11-13 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-62917449 @ankurdave is this already covered in your latest PR?

[GitHub] spark pull request: [SPARK-3650] Fix TriangleCount handling of rev...

2014-11-13 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/2495#issuecomment-62917547 @ankurdave take a look at this when you get a chance.

[GitHub] spark pull request: [SPARK-3936] Add aggregateMessages, which supe...

2014-11-12 Thread jegonzal
Github user jegonzal commented on a diff in the pull request: https://github.com/apache/spark/pull/3100#discussion_r20243545 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/TripletFields.java --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-3936] Add aggregateMessages, which supe...

2014-11-12 Thread jegonzal
Github user jegonzal commented on a diff in the pull request: https://github.com/apache/spark/pull/3100#discussion_r20257677 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/TripletFields.java --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-3936] Remove Bytecode Inspection for Jo...

2014-11-12 Thread jegonzal
Github user jegonzal closed the pull request at: https://github.com/apache/spark/pull/2815

[GitHub] spark pull request: [WIP][SPARK-3530][MLLIB] pipeline and paramete...

2014-11-05 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/3099#issuecomment-61922586 The model serving work would really benefit from being able to evaluate models without requiring a Spark context especially since we are shooting for 10s millisecond

[GitHub] spark pull request: [WIP][SPARK-3530][MLLIB] pipeline and paramete...

2014-11-05 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/3099#issuecomment-61934417 @jkbradley Right now we are planning to serve linear combinations of models derived from MLlib (currently latent factor models, naive bayes, and decision trees

[GitHub] spark pull request: [SPARK-3936] Remove Bytecode Inspection for Jo...

2014-10-31 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/2815#issuecomment-61349221 I added the `TripletFields` enum and updated all the dependent files. I can't deprecate the old API since they have the same function signature up to default arguments

[GitHub] spark pull request: [SPARK-3936] Remove Bytecode Inspection for Jo...

2014-10-31 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/2815#issuecomment-61349425 At this point I could also imagine actually having a separate function closure for each version. ```scala mapTriplets(f: Edge => ED2) mapTriplets(f

[GitHub] spark pull request: [SPARK-4130][MLlib] Fixing libSVM parser bug w...

2014-10-29 Thread jegonzal
GitHub user jegonzal opened a pull request: https://github.com/apache/spark/pull/2996 [SPARK-4130][MLlib] Fixing libSVM parser bug with extra whitespace This simple patch filters out extra whitespace entries. You can merge this pull request into a Git repository by running
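As a rough illustration of what "filtering out extra whitespace entries" can look like, here is a minimal sketch in plain Scala. It is not the MLlib parser or the patch itself; `parseLibSVMLine` is a hypothetical helper, and the assumed input is the usual libSVM text layout of a label followed by space-separated `index:value` pairs with one-based indices.

```scala
// Hypothetical helper for illustration only; not the MLlib implementation.
// Assumes each line is "label index:value index:value ..." with 1-based indices.
def parseLibSVMLine(line: String): (Double, Seq[(Int, Double)]) = {
  // Splitting on a single space yields empty strings wherever the input has
  // repeated spaces; dropping them is the whitespace fix being illustrated.
  val tokens = line.trim.split(' ').filter(_.nonEmpty)
  val label = tokens.head.toDouble
  val features = tokens.tail.map { item =>
    val Array(index, value) = item.split(':')
    (index.toInt - 1, value.toDouble) // convert to 0-based indices
  }
  (label, features.toSeq)
}
```

Without the `filter(_.nonEmpty)` step, an input such as `"1  1:0.5"` would produce an empty token that fails to parse as an `index:value` pair. The sketch does not handle blank lines.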

[GitHub] spark pull request: [SPARK-4130][MLlib] Fixing libSVM parser bug w...

2014-10-29 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/2996#issuecomment-61026000 Not sure why it failed the test. Is this an issue with the testing framework?

[GitHub] spark pull request: [SPARK-4130][MLlib] Fixing libSVM parser bug w...

2014-10-29 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/2996#issuecomment-61026298 The following implementation seems a bit more efficient but is needlessly complicated. ```scala // Count the number of empty values

[GitHub] spark pull request: [SPARK-4142][GraphX] Default numEdgePartitions

2014-10-29 Thread jegonzal
GitHub user jegonzal opened a pull request: https://github.com/apache/spark/pull/3006 [SPARK-4142][GraphX] Default numEdgePartitions Changing the default number of edge partitions to match spark parallelism.

[GitHub] spark pull request: [SPARK-3936] Remove Bytecode Inspection for Jo...

2014-10-29 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/2815#issuecomment-61026843 What is the status on this patch? I would like to merge it soon so that the python GraphX API can support these additional flags.

[GitHub] spark pull request: [SPARK-3650] Fix TriangleCount handling of rev...

2014-10-29 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/2495#issuecomment-61026881 What is the status on this patch?

[GitHub] spark pull request: Introducing an Improved Pregel API

2014-10-29 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-61027554 This is still work in progress and we need to discuss these API changes.

[GitHub] spark pull request: Improved GraphX PageRank Test Coverage

2014-10-29 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1228#issuecomment-61029490 ok to test

[GitHub] spark pull request: Improved GraphX PageRank Test Coverage

2014-10-29 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1228#issuecomment-61029482 This should now be addressed in the latest master and does not depend on PR #1217

[GitHub] spark pull request: Remove Bytecode Inspection for Join Eliminatio...

2014-10-15 Thread jegonzal
GitHub user jegonzal opened a pull request: https://github.com/apache/spark/pull/2815 Remove Bytecode Inspection for Join Elimination Removing bytecode inspection from triplet operations and introducing explicit join elimination flags. The explicit flags make the join elimination

[GitHub] spark pull request: Remove Bytecode Inspection for Join Eliminatio...

2014-10-15 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/2815#issuecomment-59263992 @ankurdave and @rxin I have not updated the applications to use the new explicit flags. I will do that in this PR pending approval for the API changes.

[GitHub] spark pull request: [SPARK-3936] Remove Bytecode Inspection for Jo...

2014-10-15 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/2815#issuecomment-59275910 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-3578] Fix upper bound in GraphGenerator...

2014-09-22 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/2439#issuecomment-56438311 This looks good to me.

[GitHub] spark pull request: [SPARK-3650] Fix TriangleCount handling of rev...

2014-09-22 Thread jegonzal
GitHub user jegonzal opened a pull request: https://github.com/apache/spark/pull/2495 [SPARK-3650] Fix TriangleCount handling of reverse edges This PR causes the TriangleCount algorithm to remove self-edges, direct edges from low-id to high-id (canonical direction), and then remove
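The first two steps named in this description (dropping self-edges and directing every edge from the lower to the higher vertex id) can be sketched on a plain Scala collection. This is an illustration only, not the GraphX implementation: `canonicalEdges` is a hypothetical name, edges are modeled as `(Long, Long)` pairs rather than GraphX `Edge` objects, and the step that is cut off in the snippet above is deliberately not reproduced.

```scala
// Hypothetical sketch on plain collections; not the GraphX TriangleCount code.
// Edges are (srcId, dstId) pairs of vertex ids.
def canonicalEdges(edges: Seq[(Long, Long)]): Seq[(Long, Long)] =
  edges
    // Step 1: drop self-edges, which can never be part of a triangle.
    .filter { case (src, dst) => src != dst }
    // Step 2: direct every edge from the lower id to the higher id.
    .map { case (src, dst) => if (src < dst) (src, dst) else (dst, src) }
```

After these two steps, an edge and its reverse map to the same pair, so `(3, 2)` and `(2, 3)` both become `(2, 3)`.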

[GitHub] spark pull request: [SPARK-3263][GraphX] Fix changes made to Graph...

2014-08-28 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/2168#issuecomment-53760885 The code changes look good to me (and were badly needed). Thanks for fixing it!

[GitHub] spark pull request: Improved GraphX PageRank Test Coverage

2014-08-28 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1228#issuecomment-53776044 Yes. This is an extension of the unit tests to catch a class of bugs addressed in PR #1217 (which has not been merged). I believe @ankurdave was working on a merge

[GitHub] spark pull request: Introducing an Improved Pregel API

2014-06-26 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47193665 I spent some time verifying the math behind the PageRank (in particular starting values) to ensure that the delta formulation behaves identically to the static

[GitHub] spark pull request: Introducing an Improved Pregel API

2014-06-26 Thread jegonzal
Github user jegonzal commented on a diff in the pull request: https://github.com/apache/spark/pull/1217#discussion_r14227560 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala --- @@ -158,4 +169,125 @@ object Pregel extends Logging { g } // end

[GitHub] spark pull request: Introducing an Improved Pregel API

2014-06-26 Thread jegonzal
Github user jegonzal commented on a diff in the pull request: https://github.com/apache/spark/pull/1217#discussion_r14227573 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala --- @@ -158,4 +169,125 @@ object Pregel extends Logging { g } // end

[GitHub] spark pull request: Improved GraphX PageRank Test Coverage

2014-06-26 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1228#issuecomment-47200276 @ankurdave thanks for pointing out this bug!

[GitHub] spark pull request: Introducing an Improved Pregel API

2014-06-26 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47204112 @ankurdave and @rxin there is an issue with the current API. The `sendMessage` function pulls the active field out of the vertex value here: https://github.com

[GitHub] spark pull request: Introducing an Improved Pregel API

2014-06-25 Thread jegonzal
GitHub user jegonzal opened a pull request: https://github.com/apache/spark/pull/1217 Introducing an Improved Pregel API The initial Pregel API coupled voting to halt with message reception. In this revision, the vertex program receives a `PregelContext` which enables the user

[GitHub] spark pull request: Introducing an Improved Pregel API

2014-06-25 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47167314 @ankurdave unfortunately, to fully accept this change we will need to break compatibility with the current Pregel API. I cannot seem to overload the apply method

[GitHub] spark pull request: Synthetic GraphX Benchmark

2014-05-16 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/720#issuecomment-42757213 Good point! I moved the benchmark into the examples folder. Is there a standard format for command line args in the example applications?

[GitHub] spark pull request: Synthetic GraphX Benchmark

2014-05-16 Thread jegonzal
GitHub user jegonzal opened a pull request: https://github.com/apache/spark/pull/720 Synthetic GraphX Benchmark This PR accomplishes two things: 1. It introduces a Synthetic Benchmark application that generates an arbitrarily large log-normal graph and executes either

[GitHub] spark pull request: Enable repartitioning of graph over different ...

2014-05-15 Thread jegonzal
GitHub user jegonzal opened a pull request: https://github.com/apache/spark/pull/719 Enable repartitioning of graph over different number of partitions It is currently very difficult to repartition a graph over a different number of partitions. This PR adds an additional

[GitHub] spark pull request: Unify GraphImpl RDDs + other graph load optimi...

2014-05-14 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/497#issuecomment-42618546 I went through this PR with Ankur and it looks good to me. There are a few minor changes but those can be moved to a second PR.

[GitHub] spark pull request: SPARK-1786: Reopening PR 724

2014-05-12 Thread jegonzal
GitHub user jegonzal opened a pull request: https://github.com/apache/spark/pull/742 SPARK-1786: Reopening PR 724 Addressing issue in MimaBuild.scala.

[GitHub] spark pull request: SPARK-1786: Reopening PR 724

2014-05-12 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/742#issuecomment-42868913 @ankurdave and @pwendell I am reopening the PR 724 to address the issue with MimaBuild. I believe I made the required changes but how can I verify?

[GitHub] spark pull request: SPARK-1786: Edge Partition Serialization

2014-05-11 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/724#issuecomment-42787343 I would like to get it into 1.0 if possible. Otherwise, we could run into issues if the user persists graphs to disk or straggler mitigation is used. @ankurdave do you

[GitHub] spark pull request: Fix error in 2d Graph Partitioner

2014-05-11 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/709#issuecomment-42703154 @rxin and @ankurdave take a look at this minor change when you get a chance. I would like to get it into the next release if possible.

[GitHub] spark pull request: SPARK-1786: Edge Partition Serialization

2014-05-11 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/724#issuecomment-42793347 My only concern is that I would prefer that things work slowly rather than fail. With reference tracking disabled it is not possible to serialize user defined types from the spark

[GitHub] spark pull request: SPARK-1577: Enabling reference tracking by def...

2014-04-23 Thread jegonzal
GitHub user jegonzal opened a pull request: https://github.com/apache/spark/pull/499 SPARK-1577: Enabling reference tracking by default in GraphX KryoRegistrator. We had originally disabled reference tracking by default; however, this now seems to create serious issues in the spark

[GitHub] spark pull request: Add Shortest-path computations to graphx.lib w...

2014-04-23 Thread jegonzal
Github user jegonzal commented on a diff in the pull request: https://github.com/apache/spark/pull/10#discussion_r11915905 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/lib/ShortestPaths.scala --- @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: Add Shortest-path computations to graphx.lib w...

2014-04-23 Thread jegonzal
Github user jegonzal commented on a diff in the pull request: https://github.com/apache/spark/pull/10#discussion_r11916394 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/lib/ShortestPaths.scala --- @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: Add Shortest-path computations to graphx.lib w...

2014-04-23 Thread jegonzal
Github user jegonzal commented on a diff in the pull request: https://github.com/apache/spark/pull/10#discussion_r11916531 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/lib/ShortestPaths.scala --- @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: Add Shortest-path computations to graphx.lib w...

2014-04-23 Thread jegonzal
Github user jegonzal commented on a diff in the pull request: https://github.com/apache/spark/pull/10#discussion_r11916619 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/lib/ShortestPaths.scala --- @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: Add Shortest-path computations to graphx.lib w...

2014-04-23 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/10#issuecomment-41199189 This code looks good to me. All my comments are with respect to potential performance issues.