kbendick edited a comment on pull request #2081:
URL: https://github.com/apache/iceberg/pull/2081#issuecomment-759369189


   > I see [Anton] added the check for `length > 0`, was there any particular 
reason for that? Do you expect `length == 0` to be handled before the pushdown?
   
   As for whether or not length == 0 should be handled before the pushdown, it 
should be if I understand correctly. By the time that this function is invoked 
in the places I listed elsewhere, all expressions should have been rewritten to 
remove NOT nodes, and then been bound. Given that this is always called either 
in metrics evaluation or row value evaluation against a predicate - START_WITH 
or NOT_STARTS_WITH, the expressions being evaluated are all 
`BoundLiteralPredicates` and the input of an empty predicate in that situation 
(e.g. `data not like '%'`) is not necessarily an invalid expression, though it 
does get rewritten to be `data != ''` via expression parsing, binding, and 
rewriting. I'm not as closely familiar with the `min` and `max` functions, but 
my understanding is that they too should be under these same conditions and 
only applied to parquet metadata after expressions have also become rewritten 
`BoundLiteralPredicate`s.
   
   Please correct me if I am mistaken 🙂 .
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to