ulysses-you commented on code in PR #39624:
URL: https://github.com/apache/spark/pull/39624#discussion_r1122540925
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/Materializable.scala:
##########
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.adaptive
+
+import java.util.concurrent.atomic.AtomicReference
+
+import scala.concurrent.Future
+
+import org.apache.spark.sql.catalyst.plans.logical.Statistics
+import org.apache.spark.sql.execution.LeafExecNode
+
+/**
+ * A materializable is an independent subgraph of the query plan. AQE framework will materialize its
+ * output before proceeding with further operators of the query plan. The data statistics of the
+ * materialized output can be used to optimize subsequent query stages.
+ */
+trait Materializable extends LeafExecNode {
+  /**
+   * Materialize this query stage, to prepare for the execution, like submitting map stages,
+   * broadcasting data, etc. The caller side can use the returned [[Future]] to wait until this
+   * stage is ready.
+   */
+  def doMaterialize(): Future[Any]
+
+  /**
+   * Cancel the stage materialization if in progress; otherwise do nothing.
+   */
+  def cancel(): Unit
+
+  /**
+   * Materialize this query stage, to prepare for the execution, like submitting map stages,
+   * broadcasting data, etc. The caller side can use the returned [[Future]] to wait until this
+   * stage is ready.
+   */
+  final def materialize(): Future[Any] = {
+    logDebug(s"Materialize query stage ${this.getClass.getSimpleName}: $id")

Review Comment:
   `id` is defined in `SparkPlan`. That is one reason for wrapping this in a new `MaterializableQueryStageExec`, which overrides `id`. Otherwise the `SparkPlan` id would conflict with the query stage id.
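
   To illustrate the conflict, here is a minimal sketch using simplified stand-ins rather than the real Spark classes. `PlanNode`, `MaterializableSketch`, `PlainStage`, and `StageWrapper` are hypothetical names invented for this example; the only assumption carried over from the discussion is that the base plan class assigns its own `id`, so a trait that logs `$id` picks up the plan id unless a wrapper overrides it with the query stage id.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for SparkPlan: each node gets an auto-assigned plan id.
object PlanNode {
  private val counter = new AtomicInteger(100)
  def newPlanId(): Int = counter.getAndIncrement()
}

abstract class PlanNode {
  val id: Int = PlanNode.newPlanId()
}

// Hypothetical stand-in for the Materializable trait: it only sees the inherited `id`.
trait MaterializableSketch extends PlanNode {
  def describe(): String =
    s"Materialize query stage ${getClass.getSimpleName}: $id"
}

// No override: `id` here is the plan id, not a query stage id.
class PlainStage extends MaterializableSketch

// Wrapper that overrides `id`, so the message reports the query stage id.
class StageWrapper(override val id: Int) extends MaterializableSketch

object IdConflictDemo {
  def main(args: Array[String]): Unit = {
    println(new PlainStage().describe())    // reports an auto-assigned plan id
    println(new StageWrapper(0).describe()) // reports the stage id passed in (0)
  }
}
```

   In this sketch, `PlainStage` logs whatever plan id the counter handed out, while `StageWrapper` always logs the id it was constructed with, which is the behaviour the override is meant to guarantee.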
