[jira] [Commented] (SPARK-7103) SparkContext.union crashed when some RDDs have no partitioner

2015-04-24 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510621#comment-14510621
 ] 

Vinod KC commented on SPARK-7103:
-

I closed PR #5678. Thanks.

 SparkContext.union crashed when some RDDs have no partitioner
 -

 Key: SPARK-7103
 URL: https://issues.apache.org/jira/browse/SPARK-7103
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.3.0, 1.3.1
Reporter: Steven She
Priority: Minor

 I encountered a bug where Spark crashes with the following stack trace:
 {noformat}
 java.util.NoSuchElementException: None.get
   at scala.None$.get(Option.scala:313)
   at scala.None$.get(Option.scala:311)
   at 
 org.apache.spark.rdd.PartitionerAwareUnionRDD.getPartitions(PartitionerAwareUnionRDD.scala:69)
 {noformat}
 Here's a minimal example that reproduces it on the Spark shell:
 {noformat}
 import org.apache.spark.HashPartitioner
 val x = sc.parallelize(Seq(1 -> true, 2 -> true, 3 -> false)).partitionBy(new HashPartitioner(1))
 val y = sc.parallelize(Seq(1 -> true))
 sc.union(y, x).count() // crashes
 sc.union(x, y).count() // works, since the first RDD has a partitioner
 {noformat}
 We had to resort to instantiating the UnionRDD directly to avoid the 
 PartitionerAwareUnionRDD.
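 A minimal sketch of that workaround, assuming a spark-shell session with {{x}} and {{y}} defined as in the reproduction above ({{UnionRDD}} lives in {{org.apache.spark.rdd}} and takes the SparkContext plus the RDDs to union):
 {noformat}
 import org.apache.spark.rdd.UnionRDD

 // Build the plain UnionRDD directly, bypassing the partitioner-aware
 // path whose getPartitions calls None.get.
 val u = new UnionRDD(sc, Seq(y, x))
 u.count()
 {noformat}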



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-7103) SparkContext.union crashed when some RDDs have no partitioner

2015-04-24 Thread Patrick Wendell (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511689#comment-14511689
 ] 

Patrick Wendell commented on SPARK-7103:


Escalated the priority since IMO this is good to fix.




[jira] [Commented] (SPARK-7103) SparkContext.union crashed when some RDDs have no partitioner

2015-04-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510507#comment-14510507
 ] 

Apache Spark commented on SPARK-7103:
-

User 'stevencanopy' has created a pull request for this issue:
https://github.com/apache/spark/pull/5679




[jira] [Commented] (SPARK-7103) SparkContext.union crashed when some RDDs have no partitioner

2015-04-24 Thread Steven She (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510508#comment-14510508
 ] 

Steven She commented on SPARK-7103:
---

Submitted PR #5679 with an added condition to SparkContext.union and a 
precondition check to PartitionerAwareUnionRDD.
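The dispatch Steven describes can be modeled in plain Scala, independent of Spark (a sketch with hypothetical names — {{chooseUnion}} is not the actual Spark source, and {{Option[Int]}} stands in for {{Option[Partitioner]}}):

```scala
// Hypothetical model of the union dispatch: use the partitioner-aware
// union only when every RDD has a partitioner and they all agree;
// otherwise fall back to the plain UnionRDD.
sealed trait UnionKind
case object PartitionerAware extends UnionKind
case object Plain extends UnionKind

def chooseUnion(partitioners: Seq[Option[Int]]): UnionKind =
  if (partitioners.nonEmpty &&
      partitioners.forall(_.isDefined) &&   // no RDD may lack a partitioner
      partitioners.flatten.toSet.size == 1) // and all partitioners must agree
    PartitionerAware
  else
    Plain
```

With this model, `chooseUnion(Seq(None, Some(1)))` picks `Plain`, mirroring the crashing `sc.union(y, x)` case from the report.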




[jira] [Commented] (SPARK-7103) SparkContext.union crashed when some RDDs have no partitioner

2015-04-23 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509939#comment-14509939
 ] 

Sean Owen commented on SPARK-7103:
--

Looks like the check needs to be expanded. Before this:

{code}
require(rdds.flatMap(_.partitioner).toSet.size == 1)
{code}

add:

{code}
require(rdds.count(_.partitioner.isDefined) == rdds.size)
{code}

I think. Want to try a PR?
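The two {{require}} preconditions can be exercised with plain Options standing in for RDD partitioners (a sketch; {{wouldPass}} is a hypothetical helper, not Spark code):

```scala
// Hypothetical helper combining the two preconditions: every RDD must
// have a partitioner, and all defined partitioners must be the same.
def wouldPass(partitioners: Seq[Option[Int]]): Boolean =
  partitioners.count(_.isDefined) == partitioners.size && // the added check
  partitioners.flatten.toSet.size == 1                    // the existing check
```

The crashing `sc.union(y, x)` case corresponds to `Seq(None, Some(1))`, which now fails the first check with a clear error instead of reaching `None.get` in `getPartitions`.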




[jira] [Commented] (SPARK-7103) SparkContext.union crashed when some RDDs have no partitioner

2015-04-23 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510477#comment-14510477
 ] 

Apache Spark commented on SPARK-7103:
-

User 'vinodkc' has created a pull request for this issue:
https://github.com/apache/spark/pull/5678
