[ 
https://issues.apache.org/jira/browse/IMPALA-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn updated IMPALA-1728:
------------------------------
    Labels: performance planner tpc-ds  (was: TPC-DS performance planner)

> sub-query with duplicate values used IN conditional operator should discard 
> the duplicate values before applying the operator
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-1728
>                 URL: https://issues.apache.org/jira/browse/IMPALA-1728
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Frontend
>    Affects Versions: Impala 2.0, Impala 2.1
>            Reporter: Dileep Kumar
>            Priority: Minor
>              Labels: performance, planner, tpc-ds
>         Attachments: q95.sql, q95.sql.DISTINCT
>
>
> When running TPC-DS Q95, we found that it uses the result of a CTE in an IN
> condition later in the query.
> In this case the CTE generates many duplicate values for the column
> used in the condition. Applying DISTINCT to the CTE made the query take
> about 40% less time to complete.
> The timings (in seconds) are:
> Without DISTINCT: 1240
> With DISTINCT: 728
> Both versions of the query are attached.
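
The rewrite described above can be illustrated with a small, self-contained sketch. It does not reproduce the attached q95.sql files. The `orders` table, the `multi_wh` CTE and the `run_variant` helper are hypothetical names invented for illustration, and SQLite stands in for Impala only to show that the IN result is unchanged when the CTE is deduplicated. It says nothing about Impala's planner or timings.

```python
import sqlite3

# Hypothetical toy schema: one row per (order, warehouse) pair.
SETUP = """
    CREATE TABLE orders (order_number INTEGER, warehouse INTEGER);
    INSERT INTO orders VALUES
        (1, 10), (1, 11), (1, 12), (2, 20), (3, 30), (3, 31);
"""

# The self-join emits one row per ordered pair of distinct warehouses,
# so the same order_number appears many times unless DISTINCT is applied.
CTE = """
    WITH multi_wh AS (
        SELECT {distinct} o1.order_number
        FROM orders o1
        JOIN orders o2
          ON o1.order_number = o2.order_number
         AND o1.warehouse <> o2.warehouse
    )
"""


def run_variant(use_distinct):
    """Return (rows produced by the CTE, rows selected via IN)."""
    cte = CTE.format(distinct="DISTINCT" if use_distinct else "")
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        cte_rows = conn.execute(cte + "SELECT COUNT(*) FROM multi_wh").fetchone()[0]
        result = conn.execute(
            cte
            + """
            SELECT order_number, warehouse
            FROM orders
            WHERE order_number IN (SELECT order_number FROM multi_wh)
            ORDER BY order_number, warehouse
            """
        ).fetchall()
        return cte_rows, result
    finally:
        conn.close()


if __name__ == "__main__":
    for flag in (False, True):
        cte_rows, result = run_variant(flag)
        print("DISTINCT" if flag else "plain", cte_rows, result)
```

Both variants select the same rows. Only the number of values fed to the IN condition differs, which is why discarding duplicates before applying the operator is a safe transformation.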



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

