[
https://issues.apache.org/jira/browse/HIVE-11726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14803343#comment-14803343
]
Jesus Camacho Rodriguez commented on HIVE-11726:
------------------------------------------------
[~jpullokkaran], it seems there is duplicated logic between HIVE-11684 and
HIVE-11726 to extract the partition columns from the IN clause; it is something
we expected.
However, after talking to [~hsubramaniyan] and thinking about both patches, it
doesn't seem logical to keep this logic in both places, as we will not be able
to bail out quickly in HIVE-11684; in fact, we will end up reapplying the same
logic to add additional operands to the predicate.
I suggest we go ahead with HIVE-11684, and that HIVE-11726 contain only the
logic to push predicates comprising IN/STRUCT to the metastore filter. Thus,
PointLookupOptimizer will only contain the logic to transform OR/AND predicates
into IN predicates.
What do you think? [~hsubramaniyan], what's your take on this?
> Pushed IN predicates created by PointLookupOptimizer to the metastore
> ---------------------------------------------------------------------
>
> Key: HIVE-11726
> URL: https://issues.apache.org/jira/browse/HIVE-11726
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11726.patch
>
>
> The PointLookupOptimizer can turn off some of the optimizations due to its
> use of tuple IN() clauses.
> HIVE-11573 introduced the extraction of sub-clauses that could be pushed down
> to the TableScan operators, though they wouldn't be pushed down to the
> metastore.
> In this issue, we tackle this problem by:
> 1) Grouping the columns in the sub-clauses depending on their lineage. This
> way PPD will be able to push them down through the plan without any extension.
> For instance, if a, b, and c are partition columns, a and b belong to table1,
> and c belongs to table2:
> {code}
> (a,b,c) IN ((1,2,3),(2,3,4)) ->
> (a,b) IN ((1,2),(2,3)) AND c IN (3,4) AND (a,b,c) IN ((1,2,3),(2,3,4))
> {code}
> 2) Extending the filter parser of the metastore to support IN clauses,
> including multiple columns. This allows those additional predicates to be
> pushed down through directSQL to the metastore.
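
The grouping described in step 1 of the quoted description can be sketched
roughly as follows. This is only an illustration in Python, not the code from
either patch: the function names (split_in_by_lineage, render) and the
lineage mapping from column to table are hypothetical, and the input is
assumed to be a plain list of value tuples.

```python
def split_in_by_lineage(columns, rows, lineage):
    """Project a multi-column IN list onto each source table's columns.

    columns: column names on the left-hand side of the IN, e.g. ("a", "b", "c")
    rows:    value tuples on the right-hand side, e.g. [(1, 2, 3), (2, 3, 4)]
    lineage: hypothetical mapping from column name to originating table
    """
    # Group column positions by the table each column comes from,
    # keeping the order in which tables first appear.
    groups = {}
    for idx, col in enumerate(columns):
        groups.setdefault(lineage[col], []).append(idx)

    derived = []
    for idxs in groups.values():
        cols = tuple(columns[i] for i in idxs)
        # Project every row onto this table's columns and drop duplicates
        # while preserving first-seen order.
        values = list(dict.fromkeys(tuple(row[i] for i in idxs) for row in rows))
        derived.append((cols, values))
    return derived


def render(cols, values):
    """Format one derived predicate as text, for illustration only."""
    if len(cols) == 1:
        return f"{cols[0]} IN ({','.join(str(v[0]) for v in values)})"
    lhs = "(" + ",".join(cols) + ")"
    rhs = ",".join("(" + ",".join(map(str, v)) + ")" for v in values)
    return f"{lhs} IN ({rhs})"


if __name__ == "__main__":
    parts = split_in_by_lineage(
        ("a", "b", "c"),
        [(1, 2, 3), (2, 3, 4)],
        {"a": "table1", "b": "table1", "c": "table2"},
    )
    for cols, values in parts:
        print(render(cols, values))
```

The derived per-table predicates are weaker than the original one (for
example, a=1, b=2, c=4 satisfies both of them but not the original tuple IN),
which is why the original predicate is kept as an extra conjunct in the
rewritten filter.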