rdblue commented on code in PR #7886:
URL: https://github.com/apache/iceberg/pull/7886#discussion_r1257535784


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -302,10 +354,25 @@ private static <T> T childAtIndex(Predicate predicate, int index) {
     return (T) predicate.children()[index];
   }
 
+  private static boolean couldConvert(org.apache.spark.sql.connector.expressions.Expression expr) {
+    return isRef(expr) || isSystemFunc(expr);
+  }
+
   private static boolean isRef(org.apache.spark.sql.connector.expressions.Expression expr) {
     return expr instanceof NamedReference;
   }
 
+  private static boolean isSystemFunc(org.apache.spark.sql.connector.expressions.Expression expr) {
+    if (expr instanceof UserDefinedScalarFunc) {
+      UserDefinedScalarFunc udf = (UserDefinedScalarFunc) expr;
+      return udf.canonicalName().startsWith("iceberg")

Review Comment:
   I think the problem is that this is using the UDF's `canonicalName` but the other check uses the function's `name`. Those can differ: `BucketFunction.name()` returns `bucket`, which is what the user would call, but the canonical function name identifies the exact bound function no matter how it is loaded, so `BucketInt.canonicalName()` returns `iceberg.bucket(int)`.
   
   If we want to limit conversion to just the Iceberg-defined functions, then this check is necessary. It may be better to maintain an explicit set of supported functions keyed by `canonicalName`.
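
   The set-based approach suggested above could be sketched as follows. This is not Iceberg's actual implementation, just an illustration: it gates conversion on an explicit allow-list of canonical names instead of a `startsWith("iceberg")` prefix check. `iceberg.bucket(int)` is the canonical name cited in this review; the class and method names here are hypothetical.

   ```java
   import java.util.Set;

   // Hypothetical sketch: an allow-list of canonical function names.
   // A real implementation would enumerate every Iceberg system function
   // binding (bucket, truncate, etc.) by its canonical name.
   public class SupportedFunctions {
     private static final Set<String> SUPPORTED = Set.of("iceberg.bucket(int)");

     static boolean isSupported(String canonicalName) {
       return SUPPORTED.contains(canonicalName);
     }

     public static void main(String[] args) {
       // The canonical name matches; the user-facing short name does not.
       System.out.println(isSupported("iceberg.bucket(int)")); // true
       System.out.println(isSupported("bucket"));              // false
     }
   }
   ```

   The advantage over a prefix check is that only functions known to be convertible pass, and the short `name()` (which users see) never has to agree with the canonical name used for the lookup.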



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
