zhztheplayer commented on code in PR #11657:
URL: 
https://github.com/apache/incubator-gluten/pull/11657#discussion_r2883376410


##########
cpp/velox/config/VeloxConfig.h:
##########
@@ -78,6 +78,10 @@ const std::string kHashProbeDynamicFilterPushdownEnabled =
 const std::string kHashProbeBloomFilterPushdownMaxSize =
     
"spark.gluten.sql.columnar.backend.velox.hashProbe.bloomFilterPushdown.maxSize";
 
+const std::string kValueStreamDynamicFilterEnabled =
+    
"spark.gluten.sql.columnar.backend.velox.valueStream.dynamicFilter.enabled";
+const bool kValueStreamDynamicFilterEnabledDefault = true;

Review Comment:
   Should it be false?



##########
gluten-substrait/src/main/scala/org/apache/gluten/backendsapi/IteratorApi.scala:
##########
@@ -85,6 +85,7 @@ trait IteratorApi {
       updateNativeMetrics: IMetrics => Unit,
       partitionIndex: Int,
       materializeInput: Boolean = false,
-      enableCudf: Boolean = false): Iterator[ColumnarBatch]
+      enableCudf: Boolean = false,
+      disableValueStreamDynamicFilter: Boolean = false): 
Iterator[ColumnarBatch]

Review Comment:
   Would it be a bit clearer to reword the parameter as `supports...`?



##########
gluten-substrait/src/main/scala/org/apache/gluten/execution/WholeStageTransformer.scala:
##########
@@ -257,7 +258,34 @@ case class WholeStageTransformer(child: SparkPlan, 
materializeInput: Boolean = f
       PlanBuilder.makePlan(substraitContext, 
Lists.newArrayList(childCtx.root), outNames)
     }
 
-    WholeStageTransformContext(planNode, substraitContext, isCudf)
+    WholeStageTransformContext(
+      planNode,
+      substraitContext,
+      isCudf,
+      hasNonDeterministicExprInJoinProbe(child))
+  }
+
+  /**
+   * Checks whether any HashJoin's probe (streamed) side contains 
non-deterministic expressions.
+   * When true, ValueStream dynamic filter pushdown must be disabled because 
the dynamic filter
+   * would be applied at the scan level (below the non-deterministic Project), 
changing how many
+   * times the non-deterministic expression is evaluated and thus altering its 
output sequence. See
+   * SPARK-10316.
+   */

Review Comment:
   > ValueStream dynamic filter pushdown must be disabled because the dynamic 
filter would be applied at the scan level
   
   If a ValueStream dynamic filter is disabled, we won't have a scan to accept 
the dynamic filter - so am I missing something here?



##########
backends-velox/src/main/scala/org/apache/gluten/config/VeloxConfig.scala:
##########
@@ -468,6 +471,14 @@ object VeloxConfig extends ConfigRegistry {
       .booleanConf
       .createWithDefault(true)
 
+  val VALUE_STREAM_DYNAMIC_FILTER_ENABLED =
+    
buildConf("spark.gluten.sql.columnar.backend.velox.valueStream.dynamicFilter.enabled")
+      .doc(
+        "Whether to apply dynamic filters pushed down from hash probe in the 
ValueStream" +
+          " (shuffle reader) operator to filter rows before they reach the 
hash join.")
+      .booleanConf
+      .createWithDefault(true)

Review Comment:
   Should it be false?



##########
backends-velox/src/main/scala/org/apache/gluten/backendsapi/velox/VeloxMetricsApi.scala:
##########
@@ -628,6 +628,12 @@ class VeloxMetricsApi extends MetricsApi with Logging {
       "hashProbeDynamicFiltersProduced" -> SQLMetrics.createMetric(
         sparkContext,
         "number of hash probe dynamic filters produced"),
+      "valueStreamDynamicFiltersAccepted" -> SQLMetrics.createMetric(
+        sparkContext,
+        "number of dynamic filters accepted by value stream"),
+      "valueStreamDynamicFilteredRows" -> SQLMetrics.createMetric(
+        sparkContext,
+        "number of rows filtered by value stream dynamic filter"),

Review Comment:
   Thanks for adding the metrics. Can you share a screenshot of Spark UI?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to