Github user rednaxelafx commented on the issue:
https://github.com/apache/spark/pull/21039
Thanks for reverting it for me. The test failure was definitely related to
the explicit nulling from this PR, but I can't yet see how that's possible.
First, this particular test was passing in build 4693, the first build that
included my change:
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.7/4693/testReport/junit/org.apache.spark.sql/TPCDSQuerySuite/q61/
The failure came in the build immediately after that one.
Second, the stack trace from the failure shows that `doConsumeWithoutKeys`
was reached from within `doProduceWithoutKeys`, at a point before the line
where I nulled out `bufVars`, so I can't see how `bufVars` could be null in
`doConsumeWithoutKeys`.
Stack trace:
```
java.lang.NullPointerException
  at org.apache.spark.sql.execution.aggregate.HashAggregateExec.doConsumeWithoutKeys(HashAggregateExec.scala:274)
  at org.apache.spark.sql.execution.aggregate.HashAggregateExec.doConsume(HashAggregateExec.scala:171)
  at org.apache.spark.sql.execution.CodegenSupport$class.constructDoConsumeFunction(WholeStageCodegenExec.scala:209)
  at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:180)
  at org.apache.spark.sql.execution.ProjectExec.consume(basicPhysicalOperators.scala:35)
  at org.apache.spark.sql.execution.ProjectExec.doConsume(basicPhysicalOperators.scala:65)
  at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:182)
  ...
  at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:83)
  at org.apache.spark.sql.execution.ProjectExec.produce(basicPhysicalOperators.scala:35)
  at org.apache.spark.sql.execution.aggregate.HashAggregateExec.doProduceWithoutKeys(HashAggregateExec.scala:237)
  at org.apache.spark.sql.execution.aggregate.HashAggregateExec.doProduce(HashAggregateExec.scala:163)
```
The relevant line in `doConsumeWithoutKeys` is:
https://github.com/apache/spark/blob/75a183071c4ed2e407c930edfdf721779662b3ee/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala#L274
It's reading `bufVars`.
The relevant line in `doProduceWithoutKeys` is:
https://github.com/apache/spark/blob/75a183071c4ed2e407c930edfdf721779662b3ee/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala#L237
It's calling `child.produce()`, which happens before `bufVars` is nulled out at line 241.
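To make the ordering argument concrete, here is a minimal self-contained sketch (hypothetical names and simplified types, not Spark's actual `CodegenSupport` API) of the produce/consume pattern described above: `bufVars` is initialized in `doProduceWithoutKeys`, read by `doConsumeWithoutKeys` while `child.produce()` is still on the stack, and only nulled out after `child.produce()` returns.

```scala
// Hypothetical sketch of the produce/consume control flow in question.
// All names are illustrative analogues of HashAggregateExec internals.
object ProduceConsumeSketch {
  // Analogue of HashAggregateExec.bufVars (a mutable field on the operator).
  var bufVars: Seq[String] = _

  // Analogue of doConsumeWithoutKeys: reads bufVars; would NPE if it were null.
  def doConsumeWithoutKeys(): String =
    bufVars.mkString(", ")

  // child.produce() modeled as a callback that synchronously ends up in consume.
  def childProduce(consume: () => String): String = consume()

  // Analogue of doProduceWithoutKeys: initialize, produce, then null out.
  def doProduceWithoutKeys(): String = {
    bufVars = Seq("count", "sum")                  // set before producing
    val body = childProduce(() => doConsumeWithoutKeys())
    bufVars = null                                 // explicit nulling, only after produce returns
    body
  }

  def main(args: Array[String]): Unit =
    println(doProduceWithoutKeys())                // prints "count, sum", no NPE
}
```

Under this control flow the nulling can never be observed by `doConsumeWithoutKeys`, which is exactly why the NPE in the failing build is puzzling.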