[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-11-24 Thread josephlijia
Github user josephlijia commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-159197234 We have implemented a faster approach using zipPartitions, though the final results are still packaged in an RDD. When data volumes are huge, it is much faster than the current implementation.

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-11-22 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-158851260 @josephlijia this feature has moved into a Spark package. If you want to file an issue report it's best to do it here:

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-11-21 Thread josephlijia
Github user josephlijia commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-158647538 When we looked up a single key-value pair with IndexedRDD, we found that it was even slower than an ordinary RDD. We use 100, keys in our experiment. When we tested it

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-10-15 Thread tispratik
Github user tispratik commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-148489954 This is very interesting. Thanks for working on it. Hopefully it will be out soon. --- If your project is set up for it, you can reply to this email and have your

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-08-27 Thread zerosign
Github user zerosign commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-135373104 Hi Ankur, Any update on this pull request?

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-07-14 Thread swethakasireddi
Github user swethakasireddi commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-121435467 Hi Ankur, Is this available in Spark 1.4.0? Also, can this be used in Spark Streaming for lookups/updates/deletes based on key instead of having to

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-06-26 Thread josephlijia
Github user josephlijia commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-115716387 When I update the value for a single key using IndexedRDD, it only re-creates one LeafNode. That is the whole cost of the update. Is that right?

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-06-26 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-115831041 @josephlijia For the old version of IndexedRDD (version 0.1), an update recreates one LeafNode, plus all InternalNodes up to the root.
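The update cost described in this reply is the usual path-copying behaviour of an immutable tree. The sketch below is only an illustration, not the IndexedRDD source: `LeafNode` and `InternalNode` are named after the reply, while `PathCopy`, `build`, `get`, `updated`, and the branching factor of 4 are hypothetical choices made for the example.

```scala
// Simplified persistent tree; an illustrative assumption, not IndexedRDD's code.
sealed trait Node
final case class LeafNode(values: Vector[Int]) extends Node
final case class InternalNode(children: Vector[Node]) extends Node

object PathCopy {
  val Bits = 2                 // log2 of the branching factor (hypothetical)
  val Width = 1 << Bits        // 4 children per node
  val Mask = Width - 1

  // Build a full tree of the given depth (depth 0 is a leaf), filled with zeros.
  def build(depth: Int): Node =
    if (depth == 0) LeafNode(Vector.fill(Width)(0))
    else InternalNode(Vector.fill(Width)(build(depth - 1)))

  // Read the value stored at `index`.
  def get(node: Node, depth: Int, index: Int): Int = node match {
    case LeafNode(vs)     => vs(index & Mask)
    case InternalNode(cs) => get(cs((index >> (Bits * depth)) & Mask), depth - 1, index)
  }

  // Return a new root with `index` set to `value`. Only the nodes on the
  // root-to-leaf path are recreated; every other subtree is shared with the old tree.
  def updated(node: Node, depth: Int, index: Int, value: Int): Node = node match {
    case LeafNode(vs) =>
      LeafNode(vs.updated(index & Mask, value))
    case InternalNode(cs) =>
      val slot = (index >> (Bits * depth)) & Mask
      InternalNode(cs.updated(slot, updated(cs(slot), depth - 1, index, value)))
  }
}
```

With `val root = PathCopy.build(2)` and `val root2 = PathCopy.updated(root, 2, 37, 99)`, the old root still reads 0 at index 37 while the new root reads 99, and the three untouched children of the root are the same objects in both trees. In this sketch an update therefore allocates one leaf plus one internal node per level above it.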

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-06-24 Thread josephlijia
Github user josephlijia commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-114842646 I ran into a question while testing IndexedRDD. I compared the original RDD with IndexedRDD for lookups, updates, joins, and deletes. However, I

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-06-24 Thread adamnovak
Github user adamnovak commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-114941993 I'm not sure this is the appropriate place to ask. Maybe make a new issue on the IndexedRDD repo? On Wed, Jun 24, 2015 at 4:52 AM, josephlijia

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-06-10 Thread josephlijia
Github user josephlijia commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-110605232 Well, I found that getting is slower than putting with IndexedRDD. But getting should be faster than putting, right? I look forward to your reply. Thanks a

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-06-10 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-110610103 @josephlijia Right, getting should generally be faster than putting. However, for large batches of keys, multiget might be slower than multiput because it currently

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-06-10 Thread josephlijia
Github user josephlijia commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-110615752 Look at the code below: def multiget(ks: Array[Id]): Map[Id, V] = { val ksByPartition = ks.groupBy(k => self.partitioner.get.getPartition(k)) val

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-06-10 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-110616280 It does send all keys to all partitions, because `ksByPartition` is referenced in the closure passed to `context.runJob` and so is shipped in full to all partitions.
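One way to picture the alternative is to hand each partition only the keys it owns. The sketch below simulates that idea with plain Scala collections. It does not use Spark and is not how the PR implements `multiget`. `RoutedLookup`, the modulo-based `getPartition`, and the `Vector[Map[Id, V]]` standing in for partitions are all assumptions made for illustration.

```scala
object RoutedLookup {
  type Id = Long

  // Hypothetical stand-in for a hash partitioner: non-negative modulo.
  def getPartition(k: Id, numPartitions: Int): Int =
    (((k % numPartitions) + numPartitions) % numPartitions).toInt

  // Each element of `partitions` plays the role of one partition's local index.
  def multiget[V](partitions: Vector[Map[Id, V]], ks: Seq[Id]): Map[Id, V] = {
    val ksByPartition: Map[Int, Seq[Id]] =
      ks.groupBy(k => getPartition(k, partitions.length))

    // Visit only the partitions that own at least one requested key, and give
    // each one just its own slice of keys instead of the whole grouped map.
    ksByPartition.toSeq.flatMap { case (pid, localKeys) =>
      val index = partitions(pid)
      localKeys.flatMap(k => index.get(k).map(v => k -> v))
    }.toMap
  }
}
```

For example, with `Vector(Map(0L -> "a", 2L -> "c"), Map(1L -> "b", 3L -> "d"))` and keys `Seq(1L, 2L, 5L)`, the lookup returns the entries for keys 1 and 2 and omits key 5, which no partition holds. In a real Spark job the same routing would have to avoid referencing the full `ksByPartition` map inside the task closure, since that reference is what ships every key to every partition.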

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-03-13 Thread jason-dai
Github user jason-dai commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-79776523 @jegonzal I wonder if you can share more details on your stack overflow issue. We were considering a general fix (e.g., as I outlined in

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-01-11 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-69504481 We should really address this stack overflow issue. Is there a JIRA we can promote?

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-01-11 Thread octavian-ganea
Github user octavian-ganea commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-69505333 Writing the RDD to disk from time to time is not a solution for me. Also, the second idea is not good if I am doing random put and get ops. A common use case is

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-01-11 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-69521957 @jegonzal https://issues.apache.org/jira/browse/SPARK-4672 is relevant specifically to GraphX encountering the stack overflow and has extensive discussion, but I don't

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-01-11 Thread jegonzal
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-69531109 Hmm, we really need to elevate this to a full issue. I have run into the stack overflow in MLlib (ALS) as well.

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-01-10 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-69475120 @octavian-ganea IndexedRDD creates a new lineage entry for each operation. This enables fault tolerance but, as with other iterative Spark programs, causes stack

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2015-01-09 Thread octavian-ganea
Github user octavian-ganea commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-69365816 Thanks for the nice work! I am trying to use this IndexedRDD as a distributed hash map and I would like to be able to insert and update many entries (tens of

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-12-23 Thread ankurdave
Github user ankurdave closed the pull request at: https://github.com/apache/spark/pull/1297

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-12-23 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-68017172 IndexedRDD is now part of Spark Packages, so I'm closing this PR and have moved it to a separate repository: https://github.com/amplab/spark-indexedrdd. The

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-12-23 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-68019141 @ankurdave Does this mean IndexedRDD will not become part of Spark Core, or is that still potentially happening in the near future?

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-12-23 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-68019859 @nchammas I don't think that's going to happen in the near future since the interface and implementation are relatively unstable, but it could still happen eventually.

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-11-21 Thread adamnovak
Github user adamnovak commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-6365 Can it be in Spark 1.3? This sort of functionality would really help us get a Spark-based implementation of the stuff that @ga4gh/global-alliance-committers is doing

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-11-12 Thread bobbych
Github user bobbych commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-62686554 Firstly, thanks for the work! I have one question: does it support getPersistentRDDs? The use case is reusing a cached RDD, something along the lines of spark job server

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-11-12 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-62687768 @bobbych IndexedRDD handles persistence by caching its partitionsRDD, which is the MapPartitionsRDD that you're getting back from sc.getPersistentRDDs. As far as I

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-11-12 Thread pwais
Github user pwais commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-62687253 Curious, will this ship in 1.2? (Also just want to ❤ for such a lovely PR)

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-10-04 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-57905819 This looks really interesting. Is there a blocker for supporting generic keys (or at least say `String`), or is that a performance issue?

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-10-04 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-57917807 @MLnick It's a slight performance issue, since we currently use PrimitiveKeyOpenHashMap which optimizes for primitive keys by avoiding null tracking, but I think the

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-10-02 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-57693369 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21218/consoleFull) for PR 1297 at commit

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-57703581 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-10-02 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-57703566 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21218/consoleFull) for PR 1297 at commit

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-26 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-56926924 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20845/consoleFull) for PR 1297 at commit

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-56932588 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-26 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-56932585 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20845/consoleFull) for PR 1297 at commit

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-23 Thread markncooper
Github user markncooper commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-56558807 Is it correct to assume that persist() is necessary otherwise the index will get recreated each time it's used?

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-23 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-56559155 @markncooper Yes, the IndexedRDD operations are implemented purely in terms of Spark transformations, so they will get recomputed each time the result is used unless

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-23 Thread ankurdave
Github user ankurdave commented on a diff in the pull request: https://github.com/apache/spark/pull/1297#discussion_r17945448 --- Diff: core/src/main/scala/org/apache/spark/rdd/IndexedRDDPartitionLike.scala --- @@ -0,0 +1,426 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-23 Thread ankurdave
Github user ankurdave commented on a diff in the pull request: https://github.com/apache/spark/pull/1297#discussion_r17945543 --- Diff: core/src/main/scala/org/apache/spark/rdd/IndexedRDDPartitionLike.scala --- @@ -0,0 +1,426 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-56605365 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20734/consoleFull) for PR 1297 at commit

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-56610747 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20734/consoleFull) for PR 1297 at commit

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-56610754 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20734/

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-19 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/1297#discussion_r17790219 --- Diff: core/src/main/scala/org/apache/spark/rdd/IndexedRDDLike.scala --- @@ -0,0 +1,338 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-19 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/1297#discussion_r17791303 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ImmutableLongOpenHashSet.scala --- @@ -0,0 +1,228 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-19 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-56199798 This looks great! My comments are minor. I know it's early to be discussing example docs, but I just wanted to mention that I can see caching being an area of

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-19 Thread markncooper
Github user markncooper commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-56236278 For what it's worth (and we are early on in our Spark usage), we've kicked the tires on this IndexedRDD and we love it. Thanks Ankur. We'll report back with a

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-03 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-54382233 What's the status of this PR? Are we blocking on design review or Spark/GraphX roadmap discussions?

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-09-03 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-54383337 We've had a design review; the summary was that this design is good, though we will eventually want to support alternative update mechanisms such as log-structured

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-14 Thread ankurdave
Github user ankurdave commented on a diff in the pull request: https://github.com/apache/spark/pull/1297#discussion_r14908709 --- Diff: core/src/main/scala/org/apache/spark/rdd/IndexedRDDLike.scala --- @@ -0,0 +1,338 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48988645 QA tests have started for PR 1297. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16655/consoleFull

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-10 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1297#discussion_r14799543 --- Diff: core/src/main/scala/org/apache/spark/rdd/IndexedRDD.scala --- @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48417786 Merged build started.

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48417780 Merged build triggered.

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48418341 Merged build triggered.

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48418352 Merged build started.

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48418901 Merged build triggered.

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48418910 Merged build started.

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48419500 Merged build finished.

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48420477 Merged build finished. All automated tests passed.

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48420478 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16435/

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48421014 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16436/

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48421013 Merged build finished. All automated tests passed.

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48144983 Merged build finished. All automated tests passed.

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48144986 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16361/

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-06 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1297#discussion_r14578164 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ImmutableVector.scala --- @@ -0,0 +1,219 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-06 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1297#discussion_r14578170 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ImmutableVector.scala --- @@ -0,0 +1,219 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-06 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48138509 @concretevitamin Thanks for the comments. I also found a way to simplify the design by unifying `IndexedRDD(Partition)Like` and `IndexedRDD(Partition)Ops` as you

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48140661 Merged build started.

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48006656 Merged build finished. All automated tests passed.

[GitHub] spark pull request: [SPARK-2365] Add IndexedRDD, an efficient upda...

2014-07-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1297#issuecomment-48006657 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16334/