Yikf commented on PR #3471: URL: https://github.com/apache/incubator-kyuubi/pull/3471#issuecomment-1247527482
Hi @pan3793, I think there is a regression in Spark 3.3.

**The symptoms of the CI failure are:**

- A broadcastable node is changed to a non-`BroadcastQueryStageExec` node after AQE, so the [Spark code](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala#L369) throws an assertion exception.

**The known preconditions of the problem are:**

- The Kyuubi `OutputSchemaTPCDSSuite` verifies output schema consistency. The tables used in the test are empty.
- A `SubqueryBroadcastExec` (converted from a `SubqueryAdaptiveBroadcastExec`) exists in the SQL, and its child is a `BroadcastExchangeExec` node wrapped by `AdaptiveSparkPlanExec`. The exchange node is reusable.

**The cause of the problem is:**

- When `SubqueryBroadcastExec` is executed, the call chain is as follows:
```
SubqueryBroadcastExec.executeCollect -> child.executeBroadcast -> AdaptiveSparkPlanExec.doExecuteBroadcast
```
- Because `BroadcastExchangeExec` is a reusable exchange, it may become a reused `QueryStageExec` after `createQueryStages`, and `QueryStageExec.allChildStagesMaterialized` may be false (its materialized result is true only when the reused `QueryStageExec` executes, see [code](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/QueryStageExec.scala#L108-L109)).
- Therefore, the generated stage is materialized, and after materialization the plan is re-optimized, including the corresponding logical plan. Because the query runs on an empty table, the runtime statistics `rowCount` is 0, so `AQEPropagateEmptyRelation` optimizes the `LogicalQueryStage` into a `LocalRelation`. After planning again, the physical plan becomes `LocalTableScanExec` instead of `BroadcastQueryStageExec`, and the assertion fails.
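To illustrate the last step of the cause described above, here is a toy Python model of the failure mode. It is not Spark code: the classes and functions (`BroadcastQueryStage`, `LocalTableScan`, `reoptimize`, `do_execute_broadcast`) are hypothetical stand-ins, and the only behavior it assumes is what is stated in this comment, namely that a stage with a runtime `rowCount` of 0 is replaced by an empty local relation, and that the broadcast path asserts the final plan is still a broadcast query stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BroadcastQueryStage:
    """Hypothetical stand-in for BroadcastQueryStageExec (toy model only)."""
    row_count: int


@dataclass(frozen=True)
class LocalTableScan:
    """Hypothetical stand-in for LocalTableScanExec over an empty relation."""


def reoptimize(stage):
    # Mimics the effect attributed to AQEPropagateEmptyRelation in this comment:
    # a materialized stage with rowCount == 0 is replaced by an empty local scan.
    if stage.row_count == 0:
        return LocalTableScan()
    return stage


def do_execute_broadcast(stage):
    # Mimics the check on the broadcast path: the final plan must still be a
    # broadcast query stage, otherwise an assertion error is raised.
    final_plan = reoptimize(stage)
    if not isinstance(final_plan, BroadcastQueryStage):
        raise AssertionError("final plan is not a broadcast query stage")
    return final_plan
```

With a non-empty stage, `do_execute_broadcast(BroadcastQueryStage(row_count=3))` returns the stage unchanged. With `row_count=0` it raises `AssertionError`, which corresponds to the empty-table case hit by `OutputSchemaTPCDSSuite`.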
