maryannxue commented on issue #25308: [SPARK-28576][SQL] fix the dead lock issue when enable new adaptive execution
URL: https://github.com/apache/spark/pull/25308#issuecomment-517893572

After an offline discussion with @cloud-fan, we found that the subquery's stats were not updated correctly because the subquery's changing plan could not stay in sync with metrics collection. The sequence is:

1. When the subquery is first submitted, its initial plan is posted to the UI.
2. As the subquery executes, the Exchange nodes within it change as well, because of the "copy" operation performed when creating a new stage.
3. The UI discards the metrics for the new Exchange node because it does not "recognize" it yet, while it keeps collecting metrics for the old Exchange node, which no longer exists in the subquery's new plan.
4. The main query proceeds to new stages or the final plan and updates the UI again. The subquery now shows the new Exchange node, so the plan itself is right, but the metrics are not.

To solve this, @cloud-fan proposed adding an interface in the UI to update accumulator IDs independently.
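
To make the four steps concrete, here is a toy model of the mismatch, written in Python for brevity. It is only a sketch under assumptions: `ToyPlanUI`, `post_plan`, `on_metric_update`, and `update_accumulator_ids` are hypothetical names, not Spark's listener API, and the remapping hook is just one possible shape for the proposed interface.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPlanUI:
    """Hypothetical stand-in for the UI's view of one plan (not Spark's API)."""
    known_ids: set = field(default_factory=set)   # accumulator IDs from the last posted plan
    metrics: dict = field(default_factory=dict)   # accumulator ID -> accumulated value

    def post_plan(self, accumulator_ids):
        # Posting a plan replaces the set of accumulator IDs the UI recognizes.
        self.known_ids = set(accumulator_ids)

    def on_metric_update(self, acc_id, value):
        # Updates for IDs the UI does not recognize are dropped (step 3).
        if acc_id not in self.known_ids:
            return False
        self.metrics[acc_id] = self.metrics.get(acc_id, 0) + value
        return True

    def update_accumulator_ids(self, old_to_new):
        # Assumed shape of the proposed hook: remap IDs without re-posting the plan.
        for old, new in old_to_new.items():
            if old in self.known_ids:
                self.known_ids.discard(old)
                self.known_ids.add(new)
                if old in self.metrics:
                    self.metrics[new] = self.metrics.pop(old)


# Without the hook: the copied Exchange reports under a new ID and is ignored.
ui = ToyPlanUI()
ui.post_plan([1])                        # step 1: initial plan, Exchange uses ID 1
accepted = ui.on_metric_update(2, 100)   # steps 2-3: copy reports under ID 2
ui.post_plan([2])                        # step 4: plan is right, metrics are gone

# With the hook: the UI learns the new ID before the metrics arrive.
fixed = ToyPlanUI()
fixed.post_plan([1])
fixed.update_accumulator_ids({1: 2})
fixed.on_metric_update(2, 100)
```

In the first run the update for ID 2 is rejected and nothing is recorded for it, even after the plan is re-posted. In the second run the same update is kept because the ID was remapped first.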
