[ 
https://issues.apache.org/jira/browse/HIVE-29657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18088199#comment-18088199
 ] 

Stamatis Zampetakis commented on HIVE-29657:
--------------------------------------------

The idea has occurred while working on the PR of HIVE-28911: 
https://github.com/apache/hive/pull/6503#discussion_r3356514833

> Improve SEARCH expansion by considering negation
> ------------------------------------------------
>
>                 Key: HIVE-29657
>                 URL: https://issues.apache.org/jira/browse/HIVE-29657
>             Project: Hive
>          Issue Type: Improvement
>          Components: CBO
>            Reporter: Stamatis Zampetakis
>            Priority: Major
>
> SEARCH is an internal logical operator that is used by CBO to represent many 
> type of range predicates. Currently, it doesn't have a physical equivalent so 
> we have to expand it to existing primitive operators (via 
> [SearchTransformer|https://github.com/apache/hive/blob/45bdea4f8bb62873ab2b2f4a8f7afc4c780d4aa3/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/SearchTransformer.java]).
> A single SEARCH expression can be expanded in different ways with potentially 
> different performance implications. 
> +Example:+
> {noformat}
> SEARCH($0, Sarg[(-∞..10), (10..20), (20..30], [50..+∞)])
> {noformat}
> +Expansion A+
> {code:sql}
> ... WHERE x < 10 OR (x > 10 AND x < 20) OR (x > 20 and x <= 30) OR (x >= 50)
> {code}
> However, negating the SEARCH can gives us simpler and potentially more 
> efficient expressions. The following SEARCH expression is equivalent with the 
> above.
> +Example:+
> {noformat}
> NOT SEARCH($0, Sarg[10, 20, (30..50)])
> {noformat}
> +Expansion B+
> {code:sql}
> ... WHERE NOT ( x = 10 OR x = 20 OR (x > 30 AND x < 50))
> ... WHERE x <> 10 AND x <> 20 AND (x <= 30 OR x >= 50)
> ... WHERE x NOT IN (10, 20) AND (x <= 30 OR x >= 50)
> {code}
> Expansions in category B are undeniably simpler and better in terms of 
> memory/CPU consumption.
> The goal of this ticket is to consider both the positive and negative SEARCH 
> expression during the expansion (e.g., using 
> org.apache.calcite.util.Sarg#negate) and pick the one that is simpler and 
> more efficient. 
> We have to be careful though not to introduce an overly complex and expensive 
> expansion check cause that would beat the entire purpose of this 
> transformation; compilation should remain fast. If that is not possible then 
> we should close this ticket as won't fix.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to