hvanhovell commented on code in PR #40352: URL: https://github.com/apache/spark/pull/40352#discussion_r1131455328
##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##########
@@ -1073,6 +1074,12 @@ class SparkConnectPlanner(val session: SparkSession) {
         }
         Some(Lead(children.head, children(1), children(2), ignoreNulls))
+      case "bloom_filter_agg" if fun.getArgumentsCount == 3 =>
+        val children = fun.getArgumentsList.asScala.toSeq.map(transformExpression)
+        Some(
+          new BloomFilterAggregate(children.head, children(1), children(2))

Review Comment:
There is a small issue here. The aggregate requires its first input to be a `Long`, whereas `DataFrameStatFunctions.bloomFilter` supports `Byte`, `Short`, `Int`, `Long`, and `String`. For the first four we can simply add a cast to long, but `String` will be an issue. We need to adapt `BloomFilterAggregate` to make it fully compatible.
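To make the type mismatch in the review comment concrete, here is a minimal sketch in plain Scala, with no Spark on the classpath. `BloomFilterInputSketch` and `toLongInput` are hypothetical names invented for illustration; they are not part of Spark and do not reflect how the planner or `BloomFilterAggregate` is implemented. The sketch only models the point made above: the four integral types widen to `Long` without loss, while a `String` has no general cast to `Long`.

```scala
// Hypothetical helper, not part of Spark: models which bloom-filter input
// types can be widened to Long before reaching a Long-only aggregate.
object BloomFilterInputSketch {

  // Right(long) when the value is an integral type that widens losslessly;
  // Left(reason) when a plain cast to Long is not a valid conversion.
  def toLongInput(value: Any): Either[String, Long] = value match {
    case b: Byte   => Right(b.toLong)
    case s: Short  => Right(s.toLong)
    case i: Int    => Right(i.toLong)
    case l: Long   => Right(l)
    case _: String => Left("String cannot be cast to Long; the aggregate would need string support")
    case other     => Left(s"Unsupported input: $other")
  }

  def main(args: Array[String]): Unit = {
    // One sample per type that the review comment lists as supported.
    val samples: Seq[Any] = Seq(1.toByte, 2.toShort, 3, 4L, "five")
    samples.foreach(v => println(s"$v -> ${toLongInput(v)}"))
  }
}
```

In this sketch the `String` case is the only supported input that ends up in `Left`, which is why a cast in the planner alone would not be enough and the aggregate itself would have to change.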