GitHub user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/7162#discussion_r33995787
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/SparkPlanTest.scala ---
    @@ -92,27 +153,25 @@ object SparkPlanTest {
        * @param expectedAnswer the expected result in a [[Seq]] of [[Row]]s.
        */
       def checkAnswer(
    -      input: DataFrame,
    -      planFunction: SparkPlan => SparkPlan,
    +      input: Seq[DataFrame],
    +      planFunction: Seq[SparkPlan] => SparkPlan,
           expectedAnswer: Seq[Row]): Option[String] = {
     
    -    val outputPlan = planFunction(input.queryExecution.sparkPlan)
    +    val outputPlan = planFunction(input.map(_.queryExecution.sparkPlan))
     
         // A very simple resolver to make writing tests easier. In contrast to the real resolver
         // this is always case sensitive and does not try to handle scoping or complex type resolution.
    -    val resolvedPlan = outputPlan transform {
    -      case plan: SparkPlan =>
    -        val inputMap = plan.children.flatMap(_.output).zipWithIndex.map {
    -          case (a, i) =>
    -            (a.name, BoundReference(i, a.dataType, a.nullable))
    --- End diff --
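
    To make the new shape concrete before getting to the failure: a call under the new signature would look roughly like this (`df1`, `df2`, and `SomeBinaryNode` are hypothetical placeholders, not code from this patch):

    ```scala
    import org.apache.spark.sql.Row
    import org.apache.spark.sql.execution.SparkPlan

    // Hypothetical call site: the test now supplies several input DataFrames plus
    // a function that combines their physical plans into the operator under test.
    val maybeError: Option[String] = SparkPlanTest.checkAnswer(
      input = Seq(df1, df2),
      planFunction = (plans: Seq[SparkPlan]) => SomeBinaryNode(plans(0), plans(1)),
      expectedAnswer = Seq(Row(1, "a"), Row(2, "b")))
    ```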
    
    After changing this new code so that it continues to generate BoundReferences, the test in this patch fails:
    
    ```
    [info]    == Exception ==
    [info]    org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1888.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1888.0 (TID 6596, localhost): java.lang.ArrayIndexOutOfBoundsException: 2
    [info]      at org.apache.spark.sql.catalyst.expressions.ArrayBackedRow$class.apply(rows.scala:88)
    [info]      at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.apply(rows.scala:144)
    [info]      at org.apache.spark.sql.Row$class.isNullAt(Row.scala:182)
    [info]      at org.apache.spark.sql.catalyst.InternalRow.isNullAt(InternalRow.scala:28)
    [info]      at SC$SpecificProjection.apply(Unknown Source)
    [info]      at org.apache.spark.sql.execution.Exchange$$anonfun$doExecute$1$$anonfun$6$$anonfun$apply$3.apply(Exchange.scala:166)
    [info]      at org.apache.spark.sql.execution.Exchange$$anonfun$doExecute$1$$anonfun$6$$anonfun$apply$3.apply(Exchange.scala:166)
    [info]      at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    [info]      at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:119)
    [info]      at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
    [info]      at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
    [info]      at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    [info]      at org.apache.spark.scheduler.Task.run(Task.scala:70)
    [info]      at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    [info]      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    [info]      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    [info]      at java.lang.Thread.run(Thread.java:745)
    ```
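
    Reconstructing the binding scheme from the removed lines above (everything past the quoted snippet is my guess at the original shape), it maps each attribute name to a BoundReference whose ordinal is the attribute's position in the concatenation of all children's outputs:

    ```scala
    import org.apache.spark.sql.catalyst.expressions.BoundReference
    import org.apache.spark.sql.execution.SparkPlan

    // Sketch of the removed binding logic (the helper name and Map wrapper are
    // mine). The ordinal `i` indexes the concatenation of *all* children's
    // outputs, so for a plan with two 2-column children, attributes from the
    // second child get bound to ordinals 2 and 3, which are out of bounds when
    // evaluated against either child's 2-column rows and would match the
    // ArrayIndexOutOfBoundsException: 2 above.
    def nameToBoundReference(plan: SparkPlan): Map[String, BoundReference] =
      plan.children.flatMap(_.output).zipWithIndex.map {
        case (a, i) => (a.name, BoundReference(i, a.dataType, a.nullable))
      }.toMap
    ```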
    
    /cc @marmbrus 

