This is an automated email from the ASF dual-hosted git repository.
yuanzhou pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git
The following commit(s) were added to refs/heads/main by this push:
new f704f09285 [GLUTEN-7143][VL] RAS: Remove experimental flags for RAS (#8154)
f704f09285 is described below
commit f704f092855d664def191935c03d3c284b97f218
Author: Hongze Zhang <[email protected]>
AuthorDate: Thu Dec 5 16:04:39 2024 +0800
[GLUTEN-7143][VL] RAS: Remove experimental flags for RAS (#8154)
Part of #7143 (not to close it)
RAS is considered production-ready now, as all Spark / Gluten UTs and the
TPC-H and TPC-DS ITs have passed for a period of time. More documentation is
needed, but we can remove the experimental flags to state that it's ready for use.
The patch removes the flags from doc and code.
---
docs/Configuration.md | 2 +-
.../gluten/extension/columnar/enumerated/EnumeratedApplier.scala | 7 ++-----
shims/common/src/main/scala/org/apache/gluten/GlutenConfig.scala | 7 ++++---
3 files changed, 7 insertions(+), 9 deletions(-)
diff --git a/docs/Configuration.md b/docs/Configuration.md
index cb8efe802e..e217be45ff 100644
--- a/docs/Configuration.md
+++ b/docs/Configuration.md
@@ -23,7 +23,7 @@ You can add these configurations into spark-defaults.conf to enable or disable t
| spark.shuffle.manager | To turn on Gluten Columnar Shuffle Plugin [...]
| spark.gluten.enabled | Enable Gluten, default is true. Just an experimental property. Recommend to enable/disable Gluten through the setting for `spark.plugins`. [...]
| spark.gluten.memory.isolation | (Experimental) Enable isolated memory mode. If true, Gluten controls the maximum off-heap memory can be used by each task to X, X = executor memory / max task slots. It's recommended to set true if Gluten serves concurrent queries within a single session, since not all memory Gluten allocated is guaranteed to be spillable. In the case, the feature should be enabled to avoid OOM. Note when true, setting spark.memory.storageF [...]
-| spark.gluten.ras.enabled | Experimental: Enables RAS (relation algebra selector) during physical planning to generate more efficient query plan. Note, this feature is still in development and may not bring performance profits. [...]
+| spark.gluten.ras.enabled | Enables RAS (relation algebra selector) during physical planning to generate more efficient query plan. Note, this feature doesn't bring performance profits by default. Try exploring option `spark.gluten.ras.costModel` for advanced usage. [...]
| spark.gluten.sql.columnar.maxBatchSize | Number of rows to be processed in each batch. Default value is 4096. [...]
| spark.gluten.sql.columnar.scanOnly | When enabled, this config will overwrite all other operators' enabling, and only Scan and Filter pushdown will be offloaded to native. [...]
| spark.gluten.sql.columnar.batchscan | Enable or Disable Columnar BatchScan, default is true [...]
diff --git a/gluten-core/src/main/scala/org/apache/gluten/extension/columnar/enumerated/EnumeratedApplier.scala b/gluten-core/src/main/scala/org/apache/gluten/extension/columnar/enumerated/EnumeratedApplier.scala
index ff822d1c0a..04cd70656e 100644
--- a/gluten-core/src/main/scala/org/apache/gluten/extension/columnar/enumerated/EnumeratedApplier.scala
+++ b/gluten-core/src/main/scala/org/apache/gluten/extension/columnar/enumerated/EnumeratedApplier.scala
@@ -21,7 +21,6 @@ import org.apache.gluten.extension.columnar.ColumnarRuleApplier.ColumnarRuleCall
import org.apache.gluten.extension.util.AdaptiveContext
import org.apache.gluten.logging.LogLevelUtil
-import org.apache.spark.annotation.Experimental
import org.apache.spark.internal.Logging
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.rules.Rule
@@ -31,11 +30,9 @@ import org.apache.spark.sql.execution.SparkPlan
* Columnar rule applier that optimizes, implements Spark plan into Gluten plan by enumerating on
* all the possibilities of executable Gluten plans, then choose the best plan among them.
*
- * NOTE: This is still working in progress. We still have a bunch of heuristic rules in this
- * implementation's rule list. Future work will include removing them from the list then
- * implementing them in EnumeratedTransform.
+ * NOTE: We still have a bunch of heuristic rules in this implementation's rule list. Future work
+ * will include removing them from the list then implementing them in EnumeratedTransform.
*/
-@Experimental
class EnumeratedApplier(
session: SparkSession,
ruleBuilders: Seq[ColumnarRuleCall => Rule[SparkPlan]])
diff --git a/shims/common/src/main/scala/org/apache/gluten/GlutenConfig.scala b/shims/common/src/main/scala/org/apache/gluten/GlutenConfig.scala
index 9ae4c0ce90..4f243f03fb 100644
--- a/shims/common/src/main/scala/org/apache/gluten/GlutenConfig.scala
+++ b/shims/common/src/main/scala/org/apache/gluten/GlutenConfig.scala
@@ -1416,9 +1416,10 @@ object GlutenConfig {
val RAS_ENABLED =
buildConf("spark.gluten.ras.enabled")
.doc(
- "Experimental: Enables RAS (relational algebra selector) during physical " +
- "planning to generate more efficient query plan. Note, this feature is still in " +
- "development and may not bring performance profits.")
+ "Enables RAS (relational algebra selector) during physical " +
+ "planning to generate more efficient query plan. Note, this feature doesn't bring " +
+ "performance profits by default. Try exploring option `spark.gluten.ras.costModel` " +
+ "for advanced usage.")
.booleanConf
.createWithDefault(false)
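
As a usage sketch (not part of this commit), the option whose description changes above could be turned on in spark-defaults.conf roughly as follows. This assumes Gluten itself is already enabled through `spark.plugins`, as the documentation table recommends. The option stays off unless set explicitly, since the definition still ends in `createWithDefault(false)`. No value is shown for `spark.gluten.ras.costModel` because this commit does not list its accepted values.

```
# Assumption: Gluten is already enabled via spark.plugins elsewhere in this file.
# Opt in to RAS; the default remains false after this commit.
spark.gluten.ras.enabled   true
```

Per the updated description, enabling the flag alone is not expected to bring a performance gain by default.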