westonpace commented on issue #33683:
URL: https://github.com/apache/arrow/issues/33683#issuecomment-1545920015

   > Maybe it’s an overkill but would using the filter subset of substrait work?
   
   That is probably overkill, though it would work if someone wanted to do it.  I 
believe bloom filters are only useful for equality / inequality, while the 
statistics support comparison.  So you probably just need =, !=, <, >, <=, >=.  The 
simplest thing might be to do what we used to do for the old Python 
datasets and accept disjunctive normal form:
   
   > Predicates are expressed using an Expression or using the disjunctive 
normal form (DNF), like [[('x', '=', 0), ...], ...]. DNF allows arbitrary 
boolean logical combinations of single column predicates. The innermost tuples 
each describe a single column predicate. The list of inner predicates is 
interpreted as a conjunction (AND), forming a more selective and multiple 
column predicate. Finally, the most outer list combines these filters as a 
disjunction (OR).
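
   To make the DNF shape concrete, here is a rough sketch of how such a filter could be checked against per-column min/max statistics. This is only an illustration: the function names and the `stats` layout (`{column: (min, max)}`) are made up for this example and are not part of any Arrow or pyarrow API. Bloom filters are left out entirely.

```python
# Hypothetical helpers, not an Arrow / pyarrow API.
# `stats` maps a column name to an assumed (min, max) pair for one row group.

def predicate_may_match(predicate, stats):
    """Return False only if the statistics prove no row can satisfy the predicate."""
    column, op, value = predicate
    if column not in stats:
        return True  # no statistics available: cannot rule the row group out
    lo, hi = stats[column]
    if op == "=":
        return lo <= value <= hi
    if op == "!=":
        # Only prunable when every value in the row group equals `value`.
        return not (lo == hi == value)
    if op == "<":
        return lo < value
    if op == "<=":
        return lo <= value
    if op == ">":
        return hi > value
    if op == ">=":
        return hi >= value
    raise ValueError(f"unsupported operator: {op!r}")


def dnf_may_match(dnf, stats):
    """Outer list is OR, each inner list is AND, each tuple is one column predicate."""
    return any(
        all(predicate_may_match(p, stats) for p in conjunction)
        for conjunction in dnf
    )


if __name__ == "__main__":
    row_group_stats = {"x": (0, 10), "y": (5, 5)}
    # (x > 10) OR (y = 5): the first branch is ruled out, the second is not.
    print(dnf_may_match([[("x", ">", 10)], [("y", "=", 5)]], row_group_stats))
```

   The check is deliberately conservative: it returns `False` only when the statistics rule out every branch of the disjunction, so a `True` result means "might match", not "does match".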

