Yikf commented on PR #3471:
URL: 
https://github.com/apache/incubator-kyuubi/pull/3471#issuecomment-1247527482

   Hi @pan3793, I think there is a regression on Spark 3.3.
   
   **The symptoms of CI failure are:**
   - After AQE, a broadcastable node is changed to a node that is not a 
BroadcastQueryStageExec, so the assertion in the [Spark 
code](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala#L369)
 fails and an exception is thrown.
   
   **The known preconditions of the problem are:**
   - The Kyuubi `OutputSchemaTPCDSSuite` verifies output schema consistency. 
The tables used in the test are empty.
   - The SQL contains a SubqueryBroadcastExec (converted from a 
SubqueryAdaptiveBroadcastExec) whose child is a BroadcastExchangeExec node 
wrapped by an AdaptiveSparkPlanExec. The exchange node is reusable.
   
   **The cause of the problem is:**
   - When SubqueryBroadcastExec is executed, the call chain is as follows:
   ```
   SubqueryBroadcastExec.executeCollect  -> child.executeBroadcast  -> 
AdaptiveSparkPlanExec.doExecuteBroadcast
   ```
   - Because BroadcastExchangeExec is a reusable exchange, it may become a 
reused QueryStageExec after `createQueryStages`, and 
QueryStageExec.allChildStagesMaterialized may be false (its materialized 
result is true only once the reused QueryStageExec has executed, see 
[code](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/QueryStageExec.scala#L108-L109)).
   - Therefore, the generated stage is materialized, and after 
materialization the plan is re-optimized, including the corresponding logical 
plan. Because the query runs on empty tables, the runtime statistics report a 
rowCount of 0, so `AQEPropagateEmptyRelation` rewrites the LogicalQueryStage 
into a LocalRelation. After re-planning, the physical plan becomes a 
LocalTableScanExec instead of a BroadcastQueryStageExec, and the assertion 
fails.
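
   The failure mode described above can be sketched with a small toy model. 
This is only an illustration, not Spark code: the object, types, and methods 
below (`AqeEmptyRelationSketch`, `BroadcastQueryStage`, `LocalTableScan`, 
`reOptimize`, `executeBroadcast`) are hypothetical stand-ins that borrow the 
names of the Spark classes, and the real re-optimization works on the logical 
plan rather than on a single node.

   ```scala
   // Toy model only: hypothetical stand-ins, not the real Spark classes.
   object AqeEmptyRelationSketch {
     sealed trait PhysicalPlan

     // Stands in for BroadcastQueryStageExec with its runtime row count.
     final case class BroadcastQueryStage(rowCount: Long) extends PhysicalPlan

     // Stands in for LocalTableScanExec, planned from an empty LocalRelation.
     case object LocalTableScan extends PhysicalPlan

     // Mimics the effect of AQEPropagateEmptyRelation plus re-planning:
     // a stage whose runtime row count is 0 becomes an empty local scan.
     def reOptimize(plan: PhysicalPlan): PhysicalPlan = plan match {
       case BroadcastQueryStage(0L) => LocalTableScan
       case other                   => other
     }

     // Mimics the check on the broadcast path: after re-optimization the
     // final plan is expected to still be a broadcast query stage.
     def executeBroadcast(
         stage: BroadcastQueryStage): Either[String, BroadcastQueryStage] =
       reOptimize(stage) match {
         case s: BroadcastQueryStage => Right(s)
         case other =>
           Left(s"expected a broadcast query stage but got $other")
       }

     def main(args: Array[String]): Unit = {
       // Non-empty input: the stage survives re-optimization.
       println(executeBroadcast(BroadcastQueryStage(rowCount = 10L)))
       // Empty input: the stage is replaced, so the check fails.
       println(executeBroadcast(BroadcastQueryStage(rowCount = 0L)))
     }
   }
   ```

   In this sketch the check returns a `Left` instead of throwing, which keeps 
the two cases easy to compare: only the stage with a row count of 0 is 
replaced and fails the check, which matches the empty-table precondition.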


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

