Github user rednaxelafx commented on the issue:
https://github.com/apache/spark/pull/21039
Thanks for reverting it for me. The test failure was definitely related to
the explicit nulling from this PR, but I can't yet see how that's possible.
First, this particular test was passing in build 4693, the first build that
included my change:
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.7/4693/testReport/junit/org.apache.spark.sql/TPCDSQuerySuite/q61/
The failure came in the build immediately after that one.
Second, the stack trace from the failure shows that `doConsumeWithoutKeys`
was reached from within `doProduceWithoutKeys`, at a point before the line
where I nulled out `bufVars`, so I can't see how `bufVars` could be null in
`doConsumeWithoutKeys`.
Stack trace:
```
java.lang.NullPointerException
  at org.apache.spark.sql.execution.aggregate.HashAggregateExec.doConsumeWithoutKeys(HashAggregateExec.scala:274)
  at org.apache.spark.sql.execution.aggregate.HashAggregateExec.doConsume(HashAggregateExec.scala:171)
  at org.apache.spark.sql.execution.CodegenSupport$class.constructDoConsumeFunction(WholeStageCodegenExec.scala:209)
  at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:180)
  at org.apache.spark.sql.execution.ProjectExec.consume(basicPhysicalOperators.scala:35)
  at org.apache.spark.sql.execution.ProjectExec.doConsume(basicPhysicalOperators.scala:65)
  at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:182)
  ...
  at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:83)
  at org.apache.spark.sql.execution.ProjectExec.produce(basicPhysicalOperators.scala:35)
  at org.apache.spark.sql.execution.aggregate.HashAggregateExec.doProduceWithoutKeys(HashAggregateExec.scala:237)
  at org.apache.spark.sql.execution.aggregate.HashAggregateExec.doProduce(HashAggregateExec.scala:163)
```
The relevant line in `doConsumeWithoutKeys` is:
https://github.com/apache/spark/blob/75a183071c4ed2e407c930edfdf721779662b3ee/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala#L274
It's reading `bufVars`.
The relevant line in `doProduceWithoutKeys` is:
https://github.com/apache/spark/blob/75a183071c4ed2e407c930edfdf721779662b3ee/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala#L237
It's calling `child.produce()`, which happens before `bufVars` is nulled out at line 241.
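To make the ordering argument concrete, here is a minimal self-contained sketch (hypothetical names and simplified types, not Spark's actual `CodegenSupport` API) of the produce/consume pattern described above: `bufVars` is initialized in `doProduceWithoutKeys`, read by `doConsumeWithoutKeys` while `child.produce()` is still on the stack, and only nulled out after `child.produce()` returns.

```scala
// Hypothetical sketch of the produce/consume control flow in question.
// All names are illustrative analogues of HashAggregateExec internals.
object ProduceConsumeSketch {
  // Analogue of HashAggregateExec.bufVars (a mutable field on the operator).
  var bufVars: Seq[String] = _

  // Analogue of doConsumeWithoutKeys: reads bufVars; would NPE if it were null.
  def doConsumeWithoutKeys(): String =
    bufVars.mkString(", ")

  // child.produce() modeled as a callback that synchronously ends up in consume.
  def childProduce(consume: () => String): String = consume()

  // Analogue of doProduceWithoutKeys: initialize, produce, then null out.
  def doProduceWithoutKeys(): String = {
    bufVars = Seq("count", "sum")                  // set before producing
    val body = childProduce(() => doConsumeWithoutKeys())
    bufVars = null                                 // explicit nulling, only after produce returns
    body
  }

  def main(args: Array[String]): Unit =
    println(doProduceWithoutKeys())                // prints "count, sum", no NPE
}
```

Under this control flow the nulling can never be observed by `doConsumeWithoutKeys`, which is exactly why the NPE in the failing build is puzzling.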