appletreeisyellow commented on issue #9171:
URL:
https://github.com/apache/arrow-datafusion/issues/9171#issuecomment-1944228016
After discussion with @alamb, we plan to do the implementation in two phases:
1. Turn each sub expression into a case expression
2. Simplify the case expression and make it easy to read
## 1. Turn each sub expression into a case expression
Each sub expression will be rewritten into its own case expression, instead of
wrapping the entire expression in a single case expression. Giving each sub
expression its own case expression ensures that the pruning predicate rewrite
logic is correct.
For example, `x < 5 AND x > 0 OR y = 10`
will be rewritten into
```sql
-- x < 5
CASE
WHEN x_null_count = x_row_count THEN false
ELSE x_min < 5
END
AND
-- x > 0
CASE
WHEN x_null_count = x_row_count THEN false
ELSE 0 < x_max
END
OR
-- y = 10
CASE
WHEN y_null_count = y_row_count THEN false
ELSE y_min <= 10 AND 10 <= y_max
END
```
However, as you can see from the example above, the final pruning predicate
rewrite can be long and hard to read. Therefore, we need phase 2 to address
this problem.
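To make the all-null guard concrete, here is a minimal Python sketch of how the rewritten `y = 10` sub expression would evaluate against container statistics. This is only an illustration, not DataFusion code: `ColumnStats` and `may_contain_eq` are hypothetical names, and the statistics values are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnStats:
    """Hypothetical per-container statistics for a single column."""
    min: Optional[int]
    max: Optional[int]
    null_count: int
    row_count: int


def may_contain_eq(stats: ColumnStats, literal: int) -> bool:
    """Mirror the CASE expression for `col = literal`."""
    # WHEN col_null_count = col_row_count THEN false
    if stats.null_count == stats.row_count:
        return False
    # ELSE col_min <= literal AND literal <= col_max
    return stats.min <= literal <= stats.max


# Made-up statistics for three containers.
all_null = ColumnStats(min=None, max=None, null_count=100, row_count=100)
in_range = ColumnStats(min=3, max=20, null_count=0, row_count=100)
out_of_range = ColumnStats(min=11, max=20, null_count=5, row_count=100)

print(may_contain_eq(all_null, 10))      # False: every value is null
print(may_contain_eq(in_range, 10))      # True: 3 <= 10 <= 20
print(may_contain_eq(out_of_range, 10))  # False: 10 is below the minimum
```

The guard runs before the min/max comparison, so a container whose column is entirely null is pruned without consulting its min/max statistics.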
## 2. Simplify the case expression and make it easy to read
Add formatting, such as parentheses `()` and new lines, to the expression
string. I will have a better idea after the phase 1 PR is done.
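As a rough illustration of the kind of formatting meant here, the Python sketch below renders an OR of AND-groups with explicit parentheses and line breaks. It is an assumption about one possible layout, not the planned implementation: `format_predicate` is a hypothetical helper, and the short strings stand in for the full case expressions.

```python
from typing import List


def format_predicate(or_groups: List[List[str]]) -> str:
    """Render an OR of AND-groups with parentheses and new lines.

    Each inner list holds sub expressions joined by AND; the outer
    list holds groups joined by OR.
    """
    rendered = []
    for group in or_groups:
        body = "\n  AND ".join(group)
        rendered.append(f"(\n  {body}\n)")
    return "\nOR\n".join(rendered)


# Placeholders stand in for the three case expressions from phase 1.
text = format_predicate([["<case x < 5>", "<case x > 0>"], ["<case y = 10>"]])
print(text)
```

Grouping the AND terms inside their own parentheses makes the precedence of `x < 5 AND x > 0 OR y = 10` visible at a glance.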