[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/11327


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216658381
  
Merging this into master, thanks!





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216647234
  
Test build #2965 has finished successfully so I'm going by that.  The other 
one had another unit test failure that is unrelated.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216646855
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57653/
Test FAILed.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216646851
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216646622
  
**[Test build #57653 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57653/consoleFull)**
 for PR 11327 at commit 
[`305a7db`](https://github.com/apache/spark/commit/305a7db60a2fc836035ed06c8207ce772c5e3b23).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216644889
  
**[Test build #2965 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2965/consoleFull)**
 for PR 11327 at commit 
[`305a7db`](https://github.com/apache/spark/commit/305a7db60a2fc836035ed06c8207ce772c5e3b23).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216621319
  
**[Test build #57653 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57653/consoleFull)**
 for PR 11327 at commit 
[`305a7db`](https://github.com/apache/spark/commit/305a7db60a2fc836035ed06c8207ce772c5e3b23).





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216620528
  
The test failure is in ExternalAppendOnlyMapSuite, which is unrelated. I'll kick 
Jenkins again.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216620552
  
Jenkins, test this please





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216613204
  
**[Test build #2965 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2965/consoleFull)**
 for PR 11327 at commit 
[`305a7db`](https://github.com/apache/spark/commit/305a7db60a2fc836035ed06c8207ce772c5e3b23).





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216612214
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216612218
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57644/
Test FAILed.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216611966
  
**[Test build #57644 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57644/consoleFull)**
 for PR 11327 at commit 
[`305a7db`](https://github.com/apache/spark/commit/305a7db60a2fc836035ed06c8207ce772c5e3b23).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216603791
  
LGTM, pending tests. It's great to have 200X speedup, thanks!





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216585173
  
**[Test build #57644 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57644/consoleFull)**
 for PR 11327 at commit 
[`305a7db`](https://github.com/apache/spark/commit/305a7db60a2fc836035ed06c8207ce772c5e3b23).





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-03 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216584419
  
Thanks for the review. I made the changes and updated the description.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216358709
  
@tgravescs Could you also update the description to reflect the new changes?





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r61801165
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -334,8 +332,41 @@ private class DefaultPartitionCoalescer(val 
balanceSlack: Double = 0.10)
 }
   }
 } else {
+  // It is possible to have unionRDD where one rdd has preferred 
locations and another rdd
+  // that doesn't. To make sure we end up with the requested number of 
partitions,
+  // make sure to put a partition in every group.
+
+  if (groupArr.size > initialHash.size) {
+// we don't have a partition assigned to every group yet so first 
try to fill them
+// with the partitions with preferred locations
+val partIter = partitionLocs.partsWithLocs.iterator
+while (partIter.hasNext && initialHash.size < groupArr.size) {
+  var (nxt_replica, nxt_part) = partIter.next()
+  if (!initialHash.contains(nxt_part)) {
+groupArr.find(pg => pg.numPartitions == 0).map(firstEmpty => {
--- End diff --

same here





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r61801132
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -334,8 +332,41 @@ private class DefaultPartitionCoalescer(val 
balanceSlack: Double = 0.10)
 }
   }
 } else {
+  // It is possible to have unionRDD where one rdd has preferred 
locations and another rdd
+  // that doesn't. To make sure we end up with the requested number of 
partitions,
+  // make sure to put a partition in every group.
+
+  if (groupArr.size > initialHash.size) {
+// we don't have a partition assigned to every group yet so first 
try to fill them
+// with the partitions with preferred locations
+val partIter = partitionLocs.partsWithLocs.iterator
+while (partIter.hasNext && initialHash.size < groupArr.size) {
+  var (nxt_replica, nxt_part) = partIter.next()
+  if (!initialHash.contains(nxt_part)) {
+groupArr.find(pg => pg.numPartitions == 0).map(firstEmpty => {
+  firstEmpty.partitions += nxt_part
+  initialHash += nxt_part
+})
+  }
+}
+  }
+
+  // if we didn't get one partitions per group from partitions with 
preferred locations
+  // use partitions without preferred locations
+  val partNoLocIter = partitionLocs.partsWithoutLocs.iterator
+  while (partNoLocIter.hasNext && initialHash.size < groupArr.size) {
+var nxt_part = partNoLocIter.next()
+if (!initialHash.contains(nxt_part)) {
+  groupArr.find(pg => pg.numPartitions == 0).map(firstEmpty => {
--- End diff --

This is still O(N*N) in the worst case; it could be
```
groupArr.filter(pg => pg.numPartitions == 0).foreach { pg =>
  // advance the shared iterator until this empty group receives a partition
  while (partNoLocIter.hasNext && pg.numPartitions == 0) {
    val nxt_part = partNoLocIter.next()
    if (!initialHash.contains(nxt_part)) {
      pg.partitions += nxt_part
      initialHash += nxt_part
    }
  }
}
```
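
For readers outside the PR, the single-pass idea in this comment can be tried in isolation. The sketch below is only an illustration, not the code that was merged: `Part`, `Group`, and `FillEmptyGroups` are hypothetical stand-ins for Spark's `Partition`, `PartitionGroup`, and the coalescer logic. The point it shows is that the candidate iterator is shared across all empty groups, so the work is roughly proportional to the number of groups plus the number of candidates rather than their product.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for Spark's Partition and PartitionGroup (assumptions,
// not the real classes).
final case class Part(index: Int)

final class Group {
  val partitions = mutable.ArrayBuffer[Part]()
  def numPartitions: Int = partitions.size
}

object FillEmptyGroups {
  // Gives each empty group one partition that has not been assigned yet.
  // The candidate iterator is consumed once in total, not once per group.
  def fill(
      groups: Seq[Group],
      candidates: Iterator[Part],
      assigned: mutable.Set[Part]): Unit = {
    groups.filter(_.numPartitions == 0).foreach { pg =>
      while (candidates.hasNext && pg.numPartitions == 0) {
        val next = candidates.next()
        if (!assigned.contains(next)) {
          pg.partitions += next
          assigned += next
        }
      }
    }
  }
}
```

If the candidates run out before every group is filled, the remaining groups simply stay empty; how the real coalescer handles that case is not shown here.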





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r61798932
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -320,7 +317,8 @@ private class DefaultPartitionCoalescer(val 
balanceSlack: Double = 0.10)
 }
   }
 
-  def throwBalls(maxPartitions: Int, prev: RDD[_], balanceSlack: Double) {
+  def throwBalls(maxPartitions: Int, prev: RDD[_],
+  balanceSlack: Double, partitionLocs: PartitionLocations) {
--- End diff --

indents





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r61798850
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -289,10 +284,12 @@ private class DefaultPartitionCoalescer(val 
balanceSlack: Double = 0.10)
* imbalance in favor of locality
* @return partition group (bin to be put in)
*/
-  def pickBin(p: Partition, prev: RDD[_], balanceSlack: Double): 
PartitionGroup = {
+  def pickBin(p: Partition, prev: RDD[_], balanceSlack: Double,
+  partitionLocs: PartitionLocations): PartitionGroup = {
--- End diff --

```
def pickBin(
    p: Partition,
    prev: RDD[_],
    balanceSlack: Double,
    partitionLocs: PartitionLocations): PartitionGroup = {
```
 





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216353266
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57556/
Test FAILed.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216353258
  
**[Test build #57556 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57556/consoleFull)**
 for PR 11327 at commit 
[`f012cd5`](https://github.com/apache/spark/commit/f012cd5fc20feb20088c808275cf283d0f594cec).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216353264
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216352965
  
**[Test build #57556 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57556/consoleFull)**
 for PR 11327 at commit 
[`f012cd5`](https://github.com/apache/spark/commit/f012cd5fc20feb20088c808275cf283d0f594cec).





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r61798109
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -169,43 +169,41 @@ private class DefaultPartitionCoalescer(val 
balanceSlack: Double = 0.10)
 
   var noLocality = true  // if true if no preferredLocations exists for 
parent RDD
 
-  // gets the *current* preferred locations from the DAGScheduler (as 
opposed to the static ones)
-  def currPrefLocs(part: Partition, prev: RDD[_]): Seq[String] = {
-prev.context.getPreferredLocs(prev, part.index).map(tl => tl.host)
-  }
-
-  // this class just keeps iterating and rotating infinitely over the 
partitions of the RDD
-  // next() returns the next preferred machine that a partition is 
replicated on
-  // the rotator first goes through the first replica copy of each 
partition, then second, third
-  // the iterators return type is a tuple: (replicaString, partition)
-  class LocationIterator(prev: RDD[_]) extends Iterator[(String, 
Partition)] {
-
-var it: Iterator[(String, Partition)] = resetIterator()
-
-override val isEmpty = !it.hasNext
-
-// initializes/resets to start iterating from the beginning
-def resetIterator(): Iterator[(String, Partition)] = {
-  val iterators = (0 to 2).map { x =>
-prev.partitions.iterator.flatMap { p =>
-  if (currPrefLocs(p, prev).size > x) Some((currPrefLocs(p, 
prev)(x), p)) else None
+  class PartitionLocations(prev: RDD[_]) {
+
+// contains all the partitions from the previous RDD that don't have 
preferred locations
+val partsWithoutLocs = ArrayBuffer[Partition]()
+// contains all the partitions from the previous RDD that have 
preferred locations
+val partsWithLocs: Array[(String, Partition)] = getAllPrefLocs(prev)
+
+// has side affect of filling in partitions without locations as well
+def getAllPrefLocs(prev: RDD[_]): Array[(String, Partition)] = {
+  val partsWithLocs = mutable.LinkedHashMap[Partition, Seq[String]]()
+  // first get the locations for each partition, only do this once 
since it can be expensive
+  prev.partitions.foreach(p => {
+  val locs = currPrefLocs(p, prev)
+  if (locs.size > 0) {
+partsWithLocs.put(p, locs)
+  } else {
+partsWithoutLocs += p
+  }
 }
-  }
-  iterators.reduceLeft((x, y) => x ++ y)
+  )
+  // convert it into an array of host to partition
+  val allLocs = (0 to 2).map(x =>
+partsWithLocs.toArray.flatMap(parts => {
+  val p = parts._1
+  val locs = parts._2
+  if (locs.size > x) Some((locs(x), p)) else None
+} )
+  )
+  allLocs.reduceLeft((x, y) => x ++ y)
 }
+  }
 
-// hasNext() is false iff there are no preferredLocations for any of 
the partitions of the RDD
-override def hasNext: Boolean = { !isEmpty }
-
-// return the next preferredLocation of some partition of the RDD
-override def next(): (String, Partition) = {
-  if (it.hasNext) {
-it.next()
-  } else {
-it = resetIterator() // ran out of preferred locations, reset and 
rotate to the beginning
-it.next()
-  }
-}
+  // gets the *current* preferred locations from the DAGScheduler (as 
opposed to the static ones)
+  def currPrefLocs(part: Partition, prev: RDD[_]): Seq[String] = {
--- End diff --

private or inline this?
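
As context for the diff quoted above: its comments say the preferred locations are fetched once per partition because the lookup can be expensive, and that partitions without locations are collected on the side. A minimal standalone sketch of that one-pass split follows. It assumes plain `Int` partition ids and a hypothetical `lookup` function in place of `currPrefLocs`; it is not the PR's code.

```scala
import scala.collection.mutable.ArrayBuffer

object SplitByLocality {
  // `lookup` is a hypothetical stand-in for the expensive preferred-location
  // query; it is called exactly once per partition id.
  def split(
      partitionIds: Seq[Int],
      lookup: Int => Seq[String]): (Seq[(Int, Seq[String])], Seq[Int]) = {
    val withLocs = ArrayBuffer[(Int, Seq[String])]()
    val withoutLocs = ArrayBuffer[Int]()
    partitionIds.foreach { p =>
      val locs = lookup(p)
      if (locs.nonEmpty) withLocs += ((p, locs)) else withoutLocs += p
    }
    (withLocs.toSeq, withoutLocs.toSeq)
  }
}
```

Returning both collections from one function avoids the side effect the diff's own comment points out, at the cost of a tuple return type.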





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r61797836
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -169,43 +169,41 @@ private class DefaultPartitionCoalescer(val 
balanceSlack: Double = 0.10)
 
   var noLocality = true  // if true if no preferredLocations exists for 
parent RDD
 
-  // gets the *current* preferred locations from the DAGScheduler (as 
opposed to the static ones)
-  def currPrefLocs(part: Partition, prev: RDD[_]): Seq[String] = {
-prev.context.getPreferredLocs(prev, part.index).map(tl => tl.host)
-  }
-
-  // this class just keeps iterating and rotating infinitely over the 
partitions of the RDD
-  // next() returns the next preferred machine that a partition is 
replicated on
-  // the rotator first goes through the first replica copy of each 
partition, then second, third
-  // the iterators return type is a tuple: (replicaString, partition)
-  class LocationIterator(prev: RDD[_]) extends Iterator[(String, 
Partition)] {
-
-var it: Iterator[(String, Partition)] = resetIterator()
-
-override val isEmpty = !it.hasNext
-
-// initializes/resets to start iterating from the beginning
-def resetIterator(): Iterator[(String, Partition)] = {
-  val iterators = (0 to 2).map { x =>
-prev.partitions.iterator.flatMap { p =>
-  if (currPrefLocs(p, prev).size > x) Some((currPrefLocs(p, 
prev)(x), p)) else None
+  class PartitionLocations(prev: RDD[_]) {
+
+// contains all the partitions from the previous RDD that don't have 
preferred locations
+val partsWithoutLocs = ArrayBuffer[Partition]()
+// contains all the partitions from the previous RDD that have 
preferred locations
+val partsWithLocs: Array[(String, Partition)] = getAllPrefLocs(prev)
+
+// has side affect of filling in partitions without locations as well
+def getAllPrefLocs(prev: RDD[_]): Array[(String, Partition)] = {
+  val partsWithLocs = mutable.LinkedHashMap[Partition, Seq[String]]()
+  // first get the locations for each partition, only do this once 
since it can be expensive
+  prev.partitions.foreach(p => {
+  val locs = currPrefLocs(p, prev)
+  if (locs.size > 0) {
+partsWithLocs.put(p, locs)
+  } else {
+partsWithoutLocs += p
+  }
 }
-  }
-  iterators.reduceLeft((x, y) => x ++ y)
+  )
+  // convert it into an array of host to partition
+  val allLocs = (0 to 2).map(x =>
+partsWithLocs.toArray.flatMap(parts => {
+  val p = parts._1
+  val locs = parts._2
+  if (locs.size > x) Some((locs(x), p)) else None
--- End diff --

We may just append (locs(x), p) to an ArrayBuffer
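
One way to read this suggestion, sketched under assumptions: loop over replica ranks 0 to 2 and append each `(host, partition)` pair directly to an `ArrayBuffer`, instead of building three intermediate arrays of `Option`s and concatenating them. The names `ReplicaOrder` and `hostPartitionPairs` are hypothetical, and partitions are represented by plain `Int` ids to keep the example self-contained.

```scala
import scala.collection.mutable.ArrayBuffer

object ReplicaOrder {
  // Flattens (partition, hosts) entries into (host, partition) pairs, ordered
  // by replica rank: every first replica, then every second, then every third.
  def hostPartitionPairs(
      locs: Seq[(Int, Seq[String])],
      maxReplicas: Int = 3): Array[(String, Int)] = {
    val out = ArrayBuffer[(String, Int)]()
    (0 until maxReplicas).foreach { x =>
      locs.foreach { case (p, hosts) =>
        if (hosts.size > x) out += ((hosts(x), p))
      }
    }
    out.toArray
  }
}
```

The output order matches what the `flatMap` and `reduceLeft` version in the diff would produce for the same input, since both walk rank by rank and keep partition order within a rank.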





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216350597
  
@tgravescs That's great, could you fix the style?





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216339716
  
**[Test build #57550 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57550/consoleFull)** for PR 11327 at commit [`2eff583`](https://github.com/apache/spark/commit/2eff583d896b1032477a299aa9ae488711d5f01c).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  class PartitionLocations(prev: RDD[_]) `





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216339722
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57550/
Test FAILed.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216339720
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216339359
  
Ok, I replaced the location iterator and now get all the preferred 
locations up front.  This cut the run time from around a minute 
down to around 6 seconds. 

I kept most of the logic the same and just changed how it's getting the 
locations.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-05-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-216339363
  
**[Test build #57550 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57550/consoleFull)** for PR 11327 at commit [`2eff583`](https://github.com/apache/spark/commit/2eff583d896b1032477a299aa9ae488711d5f01c).





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-03-24 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-200963748
  
I think the current implementation does not handle locations changing, and 
we can't handle that.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-03-24 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-200961327
  
Thanks for the feedback.  I'm fine with that change and had actually 
considered it. I just wasn't sure whether the location iterator was intended 
to handle the locations changing (in some case I'm not aware of), so I took 
the less invasive approach of leaving that part the same.

If we aren't aware of any issues with that, I'll make the changes.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-03-24 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-200959288
  
@tgravescs The current change is good for your case; I'm thinking that 
maybe we could do better.

The LocationIterator has a bad smell: it may call getPreferredLocs() many 
times on the same partition, which could be expensive, as you mentioned. We 
should call getPreferredLocs() on a partition only once, either by caching 
its result or by getting rid of LocationIterator entirely.

Since we will call getPreferredLocs on every partition of the previous RDD, we 
could eagerly call it at the beginning and split the partitions into two 
groups, those with preferred locations and those without, and handle each 
group differently. This approach should cover all the cases: 1) all partitions 
have locs, 2) some partitions have locs, 3) none of the partitions have locs.

Hopefully, this could be done in less than 10 seconds in your case.
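
The eager approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the PR's implementation: `Part` is a hypothetical stand-in for Spark's `Partition`, and `prefLocs` stands in for the potentially expensive preferred-location lookup.

```scala
object EagerLocs {
  // Hypothetical stand-in for Spark's Partition, for illustration only.
  final case class Part(index: Int)

  // Look up locations exactly once per partition, then split the partitions
  // into those that have preferred locations and those that have none.
  def splitByLocality(
      parts: Seq[Part],
      prefLocs: Part => Seq[String]): (Seq[(Part, Seq[String])], Seq[Part]) = {
    val resolved = parts.map(p => (p, prefLocs(p)))
    val (withLocs, withoutLocs) = resolved.partition { case (_, locs) => locs.nonEmpty }
    (withLocs, withoutLocs.map(_._1))
  }
}
```

Because the lookup result is kept alongside each partition, later passes can reuse it rather than calling the lookup again, and the three cases (all, some, or no partitions with locations) fall out of the two result sequences being full, mixed, or empty.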





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-03-24 Thread tgravescs
Github user tgravescs commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r57363138
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -324,6 +319,40 @@ private class PartitionCoalescer(maxPartitions: Int, prev: RDD[_], balanceSlack:
 }
   }
 } else {
+  // It is possible to have unionRDD where one rdd has preferred locations and another rdd
+  // that doesn't. To make sure we end up with the requested number of partitions,
+  // make sure to put a partition in every group.
+
+  if (groupArr.size > initialHash.size) {
+// we don't have a partition assigned to every group yet so first try to fill them
+// with the partitions with preferred locations
+var tries = 0
+val rotIt = new LocationIterator(prev)
+while (tries < prev.partitions.length && initialHash.size < groupArr.size) {
+  // if the number of partitions with preferred locations is less than
+  // number of total partitions this might loop over some more than once but we need to
+  // handle both cases and it's not easy to get # of partitions with preferred locs
+  var (nxt_replica, nxt_part) = rotIt.next()
+  if (!initialHash.contains(nxt_part)) {
+groupArr.find(pg => pg.size == 0).map(firstEmpty => {
+  firstEmpty.arr += nxt_part
+  initialHash += nxt_part
+})
+  }
+  tries += 1
+}
+  }
+  // we have gone through all with preferred locations now just make sure one
+  // partition per group
+  val numEmptyPartitionGroups = groupArr.length - getPartitions.length
--- End diff --

I split it this way because one is called setupGroups and the throwBalls 
one puts stuff in the groups. I see the fact that we put stuff in them in 
setupGroups as an optimization rather than a necessity. 

Do you see a benefit to doing it there vs. here? 





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-03-24 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r57361560
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -324,6 +319,40 @@ private class PartitionCoalescer(maxPartitions: Int, prev: RDD[_], balanceSlack:
 }
   }
 } else {
+  // It is possible to have unionRDD where one rdd has preferred locations and another rdd
+  // that doesn't. To make sure we end up with the requested number of partitions,
+  // make sure to put a partition in every group.
+
+  if (groupArr.size > initialHash.size) {
+// we don't have a partition assigned to every group yet so first try to fill them
+// with the partitions with preferred locations
+var tries = 0
+val rotIt = new LocationIterator(prev)
+while (tries < prev.partitions.length && initialHash.size < groupArr.size) {
+  // if the number of partitions with preferred locations is less than
+  // number of total partitions this might loop over some more than once but we need to
+  // handle both cases and it's not easy to get # of partitions with preferred locs
+  var (nxt_replica, nxt_part) = rotIt.next()
+  if (!initialHash.contains(nxt_part)) {
+groupArr.find(pg => pg.size == 0).map(firstEmpty => {
--- End diff --

This is also O(N*N)
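
The quadratic cost comes from calling `groupArr.find(pg => pg.size == 0)` inside the loop, which rescans the groups from the start on every iteration. One way to avoid the rescans, sketched below under simplifying assumptions, is to walk the empty groups once with a single iterator. `Group` and `fill` are hypothetical names, partitions are reduced to plain `Int` ids, and the candidates are assumed to be already de-duplicated.

```scala
import scala.collection.mutable.ArrayBuffer

object FillEmptyGroups {
  // Hypothetical stand-in for a partition group; holds partition ids.
  final class Group { val arr = ArrayBuffer.empty[Int] }

  // Hand one candidate to each empty group in a single pass over the groups,
  // instead of searching for the first empty group on every iteration.
  def fill(groups: Seq[Group], candidates: Iterator[Int]): Unit = {
    val empties = groups.iterator.filter(_.arr.isEmpty)
    while (empties.hasNext && candidates.hasNext) {
      empties.next().arr += candidates.next()
    }
  }
}
```

Each group is inspected at most once, so the pass is linear in the number of groups plus the number of candidates consumed.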





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-03-24 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r57361348
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -324,6 +319,40 @@ private class PartitionCoalescer(maxPartitions: Int, prev: RDD[_], balanceSlack:
 }
   }
 } else {
+  // It is possible to have unionRDD where one rdd has preferred locations and another rdd
+  // that doesn't. To make sure we end up with the requested number of partitions,
+  // make sure to put a partition in every group.
+
+  if (groupArr.size > initialHash.size) {
+// we don't have a partition assigned to every group yet so first try to fill them
+// with the partitions with preferred locations
+var tries = 0
+val rotIt = new LocationIterator(prev)
+while (tries < prev.partitions.length && initialHash.size < groupArr.size) {
+  // if the number of partitions with preferred locations is less than
+  // number of total partitions this might loop over some more than once but we need to
+  // handle both cases and it's not easy to get # of partitions with preferred locs
+  var (nxt_replica, nxt_part) = rotIt.next()
+  if (!initialHash.contains(nxt_part)) {
+groupArr.find(pg => pg.size == 0).map(firstEmpty => {
+  firstEmpty.arr += nxt_part
+  initialHash += nxt_part
+})
+  }
+  tries += 1
+}
+  }
+  // we have gone through all with preferred locations now just make sure one
+  // partition per group
+  val numEmptyPartitionGroups = groupArr.length - getPartitions.length
--- End diff --

Should we move this into setupGroups(), so we still meet the assumption 
that all the groups have one partition?





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-03-24 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r57361377
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -324,6 +319,40 @@ private class PartitionCoalescer(maxPartitions: Int, prev: RDD[_], balanceSlack:
 }
   }
 } else {
+  // It is possible to have unionRDD where one rdd has preferred locations and another rdd
+  // that doesn't. To make sure we end up with the requested number of partitions,
+  // make sure to put a partition in every group.
+
+  if (groupArr.size > initialHash.size) {
+// we don't have a partition assigned to every group yet so first try to fill them
+// with the partitions with preferred locations
+var tries = 0
+val rotIt = new LocationIterator(prev)
+while (tries < prev.partitions.length && initialHash.size < groupArr.size) {
+  // if the number of partitions with preferred locations is less than
+  // number of total partitions this might loop over some more than once but we need to
+  // handle both cases and it's not easy to get # of partitions with preferred locs
+  var (nxt_replica, nxt_part) = rotIt.next()
+  if (!initialHash.contains(nxt_part)) {
+groupArr.find(pg => pg.size == 0).map(firstEmpty => {
+  firstEmpty.arr += nxt_part
+  initialHash += nxt_part
+})
+  }
+  tries += 1
+}
+  }
+  // we have gone through all with preferred locations now just make sure one
+  // partition per group
+  val numEmptyPartitionGroups = groupArr.length - getPartitions.length
+  val partitionsNotInGroups = prev.partitions.filter(p => !initialHash.contains(p))
+  for (i <- 0 until math.min(numEmptyPartitionGroups, partitionsNotInGroups.length)) {
+groupArr.find(pg => pg.size == 0).map(firstEmpty => {
--- End diff --

This is still O(N*N)





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-03-24 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-200929309
  
@tgravescs I'm reviewing this now, sorry for the delay.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-03-24 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-200904458
  
ping @davies @rxin   Any chance I can get review on this?





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-03-07 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-193340890
  
@tgravescs I have not had enough time to look into the details yet (I'm not 
familiar with this part), sorry for the delay.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-03-07 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-193293645
  
ping @davies  was there any other concern or does this look good?





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-29 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r54446793
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -192,7 +192,8 @@ private class PartitionCoalescer(maxPartitions: Int, prev: RDD[_], balanceSlack:
 def resetIterator(): Iterator[(String, Partition)] = {
   val iterators = (0 to 2).map( x =>
 prev.partitions.iterator.flatMap(p => {
-  if (currPrefLocs(p).size > x) Some((currPrefLocs(p)(x), p)) else None
+  val locs = currPrefLocs(p)
+  if (locs.size > x) Some((locs(x), p)) else None
--- End diff --

I was just responding to @davies' comment. I'm not saying anything is wrong 
with this.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-29 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-190241612
  
Are there any other comments about functionality?  





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-29 Thread tgravescs
Github user tgravescs commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r54418281
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -192,7 +192,8 @@ private class PartitionCoalescer(maxPartitions: Int, prev: RDD[_], balanceSlack:
 def resetIterator(): Iterator[(String, Partition)] = {
   val iterators = (0 to 2).map( x =>
 prev.partitions.iterator.flatMap(p => {
-  if (currPrefLocs(p).size > x) Some((currPrefLocs(p)(x), p)) else None
+  val locs = currPrefLocs(p)
+  if (locs.size > x) Some((locs(x), p)) else None
--- End diff --

@rxin I don't follow your first sentence. 

Your second sentence says to use size for a collection, and Seq is a collection.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-28 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r54375032
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -192,7 +192,8 @@ private class PartitionCoalescer(maxPartitions: Int, prev: RDD[_], balanceSlack:
 def resetIterator(): Iterator[(String, Partition)] = {
   val iterators = (0 to 2).map( x =>
 prev.partitions.iterator.flatMap(p => {
-  if (currPrefLocs(p).size > x) Some((currPrefLocs(p)(x), p)) else None
+  val locs = currPrefLocs(p)
+  if (locs.size > x) Some((locs(x), p)) else None
--- End diff --

It's not size vs length. It's that seq.size can sometimes be O(n).

For size vs length, we should use length if it is a string or an array, but 
size if it is a collection.
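
As a side note on the cost concern: on a `Seq`, `size` and `length` return the same value, and how expensive either one is depends on the concrete collection (a `List` has to be traversed, for example). When the only question is whether an index exists, `lengthCompare` avoids computing the full length. The helper below is a hypothetical sketch, not something proposed in the PR.

```scala
object SizeVsLength {
  // True when locs has an element at index x. lengthCompare stops walking
  // a linear sequence as soon as the answer is known.
  def hasReplica(locs: Seq[String], x: Int): Boolean =
    locs.lengthCompare(x) > 0
}
```

With this, a check like `locs.size > x` could be written as `SizeVsLength.hasReplica(locs, x)`; for the short location lists discussed here the difference is likely negligible.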






[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-25 Thread tgravescs
Github user tgravescs commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r54115176
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -192,7 +192,8 @@ private class PartitionCoalescer(maxPartitions: Int, prev: RDD[_], balanceSlack:
 def resetIterator(): Iterator[(String, Partition)] = {
   val iterators = (0 to 2).map( x =>
 prev.partitions.iterator.flatMap(p => {
-  if (currPrefLocs(p).size > x) Some((currPrefLocs(p)(x), p)) else None
+  val locs = currPrefLocs(p)
+  if (locs.size > x) Some((locs(x), p)) else None
--- End diff --

Are you sure about this?  Overall I'm fine with changing it, but I just want 
to understand for the future.

locs here is a Seq.

From the scaladoc for size:
The size of this sequence, equivalent to length

The scaladoc for length on a sequence also says:
Note: will not terminate for infinite-sized collections.

Looking at the Scala source code:

https://github.com/scala/scala/blob/2.10.x/src/library/scala/collection/SeqLike.scala#L106




[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-25 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r54116119
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -192,7 +192,8 @@ private class PartitionCoalescer(maxPartitions: Int, prev: RDD[_], balanceSlack:
 def resetIterator(): Iterator[(String, Partition)] = {
   val iterators = (0 to 2).map( x =>
 prev.partitions.iterator.flatMap(p => {
-  if (currPrefLocs(p).size > x) Some((currPrefLocs(p)(x), p)) else None
+  val locs = currPrefLocs(p)
+  if (locs.size > x) Some((locs(x), p)) else None
--- End diff --

cc @rxin 





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-24 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11327#discussion_r54019069
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ---
@@ -192,7 +192,8 @@ private class PartitionCoalescer(maxPartitions: Int, prev: RDD[_], balanceSlack:
 def resetIterator(): Iterator[(String, Partition)] = {
   val iterators = (0 to 2).map( x =>
 prev.partitions.iterator.flatMap(p => {
-  if (currPrefLocs(p).size > x) Some((currPrefLocs(p)(x), p)) else None
+  val locs = currPrefLocs(p)
+  if (locs.size > x) Some((locs(x), p)) else None
--- End diff --

locs.size -> locs.length

`size` could be O(N), while `length` is O(1)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-24 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-188142561
  
cc @davies





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187969886
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187969889
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51799/
Test PASSed.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187969631
  
**[Test build #51799 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51799/consoleFull)** for PR 11327 at commit [`8665114`](https://github.com/apache/spark/commit/86651146e90d5126d379a4abc6d73a8c6b7a50df).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187921218
  
**[Test build #51799 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51799/consoleFull)** for PR 11327 at commit [`8665114`](https://github.com/apache/spark/commit/86651146e90d5126d379a4abc6d73a8c6b7a50df).





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187891776
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187891779
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51789/





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187891764
  
**[Test build #51789 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51789/consoleFull)** for PR 11327 at commit [`c9eb032`](https://github.com/apache/spark/commit/c9eb032af8e453a5ba6776279cf0cd6946d0cd55).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187891331
  
**[Test build #51789 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51789/consoleFull)** for PR 11327 at commit [`c9eb032`](https://github.com/apache/spark/commit/c9eb032af8e453a5ba6776279cf0cd6946d0cd55).





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187885309
  
Jenkins, test this please





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187883014
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187883016
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51787/





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187875082
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187875090
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51786/





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187875072
  
**[Test build #51786 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51786/consoleFull)** for PR 11327 at commit [`afe14dc`](https://github.com/apache/spark/commit/afe14dce508b1e51820f16e33f09c9aa402bca3e).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187874407
  
**[Test build #51786 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51786/consoleFull)** for PR 11327 at commit [`afe14dc`](https://github.com/apache/spark/commit/afe14dce508b1e51820f16e33f09c9aa402bca3e).





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187858375
  
Jenkins, test this please





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187851796
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-11316] coalesce doesn't handle UnionRDD...

2016-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11327#issuecomment-187851798
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51784/

