kbendick commented on a change in pull request #2761:
URL: https://github.com/apache/iceberg/pull/2761#discussion_r678890318



##########
File path: python/iceberg/api/expressions/expression_parser.py
##########
@@ -44,7 +44,7 @@
 ident = Word(alphas, alphanums + "_$").setName("identifier")
 columnName = delimitedList(ident, ".", combine=True).setName("column name")
 
-binop = oneOf("= == != < > >= <= eq ne lt le gt ge <>", caseless=False)
+binop = oneOf("= == != < > >= <= eq ne lt le gt ge <> startsWith", caseless=False)

Review comment:
       Would it make sense to also allow `starts_with` in the expression 
parser? The SQL wouldn't be exactly portable outside of PySpark etc., but users 
might want the language-natural `starts_with`, and we already accept `lt` and 
`le` alongside their counterparts `<` and `<=`.
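
A minimal sketch of the alias idea, kept independent of pyparsing: each accepted spelling normalizes to one canonical operation name, so `startsWith` and `starts_with` behave the same way `<` and `lt` do. The `OP_ALIASES` table and the `normalize_op` helper are hypothetical names used for illustration only and are not part of the Iceberg code.

```python
# Hypothetical alias table (not from the PR): every accepted spelling
# maps to a single canonical operation name.
OP_ALIASES = {
    "<": "lt",
    "lt": "lt",
    "<=": "le",
    "le": "le",
    "startsWith": "starts_with",
    "starts_with": "starts_with",
}


def normalize_op(token: str) -> str:
    """Return the canonical operation name for a parsed operator token."""
    try:
        return OP_ALIASES[token]
    except KeyError:
        # Unknown spellings are rejected rather than silently passed through.
        raise ValueError(f"unsupported operator: {token!r}") from None
```

With a table like this, accepting a second spelling would only require listing it in the parser's operator set and adding one more entry here.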

##########
File path: python/iceberg/api/expressions/inclusive_manifest_evaluator.py
##########
@@ -159,3 +159,7 @@ def in_(self, ref, lit):
 
     def not_in(self, ref, lit):
         return ROWS_MIGHT_MATCH
+
+    def starts_with(self, ref, lit):
+        # if it's all null then we can't match :)
+        return self.not_null(ref)

Review comment:
       Most of the functions seem to explicitly return the constants 
`ROWS_MIGHT_MATCH` and `ROWS_CANNOT_MATCH`. Would this fit in better as `return 
ROWS_MIGHT_MATCH if self.not_null(ref) else ROWS_CANNOT_MATCH` or similar?
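
A minimal sketch of what the explicit-constant style could look like, assuming `not_null` returns `ROWS_MIGHT_MATCH` when the column may hold non-null values and `ROWS_CANNOT_MATCH` when it is all null. The constant values and the `ManifestEvalSketch` class are hypothetical stand-ins, not the actual evaluator.

```python
# Assumed values; the real module defines its own constants.
ROWS_MIGHT_MATCH = True
ROWS_CANNOT_MATCH = False


class ManifestEvalSketch:
    """Hypothetical stand-in that only tracks which columns are all null."""

    def __init__(self, all_null_columns):
        self.all_null_columns = set(all_null_columns)

    def not_null(self, ref):
        # An all-null column cannot contain a non-null value.
        if ref in self.all_null_columns:
            return ROWS_CANNOT_MATCH
        return ROWS_MIGHT_MATCH

    def starts_with(self, ref, lit):
        # A prefix can only match a non-null value, so an all-null
        # column is ruled out; anything else might match.
        if self.not_null(ref) == ROWS_CANNOT_MATCH:
            return ROWS_CANNOT_MATCH
        return ROWS_MIGHT_MATCH
```

Spelling out both branches keeps the method consistent with its neighbours and makes the direction of the check visible at a glance.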




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


