zhztheplayer commented on code in PR #5433:
URL: https://github.com/apache/incubator-gluten/pull/5433#discussion_r1572009913
##########
gluten-ut/spark33/src/test/scala/org/apache/spark/sql/GlutenBloomFilterAggregateQuerySuite.scala:
##########
@@ -113,4 +113,37 @@ class GlutenBloomFilterAggregateQuerySuite
}
}
}
+
+ testGluten("Test bloom_filter_agg fallback with might_contain offloaded") {
+ val table = "bloom_filter_test"
+ val numEstimatedItems = 5000000L
+ val numBits = GlutenConfig.getConf.veloxBloomFilterMaxNumBits
+ val sqlString = s"""
+ |SELECT col positive_membership_test
+ |FROM $table
+ |WHERE might_contain(
+ | (SELECT bloom_filter_agg(col,
+ | cast($numEstimatedItems as long),
+ | cast($numBits as long))
+ | FROM $table), col)
+ """.stripMargin
+
+ withTempView(table) {
+ (Seq(Long.MinValue, 0, Long.MaxValue) ++ (1L to 200000L))
+ .toDF("col")
+ .createOrReplaceTempView(table)
+ withSQLConf(
+ GlutenConfig.COLUMNAR_HASHAGG_ENABLED.key -> "false"
Review Comment:
> If this is the only case that triggers bloom_filter_agg fallback?
Probably not; there are likely still other cases that make the agg fall back,
e.g., validation failures caused by other agg functions. Since the agg and
might_contain are not in the same query/sub-query, and AQE on/off and the other
validation/transformation rules also have to be taken into account,
implementing such a co-fallback would be very messy. Let's continue with the
new approach introduced in
https://github.com/apache/incubator-gluten/pull/5435, which lets vanilla Spark
run Velox's bloom filter. With that, we can thoroughly solve all the issues
related to bloom filter mismatch, including these fallback problems.
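
To illustrate what "bloom filter mismatch" means here, below is a minimal,
self-contained sketch. It is a toy model only: `ToyBloomFilter`, its `seed`
parameter, and `BloomMismatchDemo` are hypothetical names made up for this
example and do not reflect Spark's or Velox's actual hashing or serialization.
The `seed` simply stands in for "which engine's bloom filter implementation is
used". The point is that bits written by one implementation and probed by a
different one can produce false negatives, which a bloom filter must never do.

```scala
// Toy Bloom filter. NOT Spark's or Velox's real implementation (assumption:
// the two engines differ in how they map a value to bit positions, which is
// modeled here by a different `seed`).
final class ToyBloomFilter(val bits: Array[Long], seed: Int) {
  private val numBits: Long = bits.length.toLong * 64L
  private val numHashes = 3

  // SplitMix64-style bit mixer, used only to get well-spread hash values.
  private def mix(x: Long): Long = {
    var z = x + 0x9E3779B97F4A7C15L
    z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L
    z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL
    z ^ (z >>> 31)
  }

  // Bit positions for a value; they depend on `seed`, i.e. on the "engine".
  private def positions(value: Long): Seq[Int] =
    (0 until numHashes).map { i =>
      val h = mix(value ^ mix(seed.toLong * 31L + i))
      java.lang.Math.floorMod(h, numBits).toInt
    }

  def put(value: Long): Unit =
    positions(value).foreach(p => bits(p / 64) |= (1L << (p % 64)))

  def mightContain(value: Long): Boolean =
    positions(value).forall(p => (bits(p / 64) & (1L << (p % 64))) != 0L)
}

object BloomMismatchDemo {
  // Build with one "engine", probe the same bits with another, and count how
  // many inserted values are wrongly reported as absent (false negatives).
  def countMisses(buildSeed: Int, probeSeed: Int): Int = {
    val shared = new Array[Long](1024) // 65536 bits, shared like a serialized filter
    val builder = new ToyBloomFilter(shared, buildSeed)
    (1L to 1000L).foreach(v => builder.put(v))
    val prober = new ToyBloomFilter(shared, probeSeed)
    (1L to 1000L).count(v => !prober.mightContain(v))
  }

  def main(args: Array[String]): Unit = {
    // Same implementation on both sides: no false negatives.
    println(s"same engine misses:  ${countMisses(1, 1)}")
    // Different implementation on the probe side: inserted values go missing.
    println(s"other engine misses: ${countMisses(1, 2)}")
  }
}
```

In this model, a co-fallback would mean forcing both sides onto the same
`seed`, whereas the approach from the linked PR corresponds to making the
"vanilla" side able to read the "native" layout directly.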
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]