[ 
https://issues.apache.org/jira/browse/CALCITE-838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated CALCITE-838:
--------------------------------
    Summary: Optimize "join to most recent" queries  (was: Optimize join-to-MAX 
queries)

> Optimize "join to most recent" queries
> --------------------------------------
>
>                 Key: CALCITE-838
>                 URL: https://issues.apache.org/jira/browse/CALCITE-838
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Assignee: Julian Hyde
>
> A common class of queries joins a fact table to an "current" record in a 
> dimension table. What appears to be a many-to-many join becomes many-to-one 
> because there is only one current record.
> Here the query on orders (fact) and customers (dimension): {code}
> SELECT *
> FROM Orders
> JOIN Customers AS c USING (customerId)
> WHERE NOT EXISTS (
>   SELECT 1
>   FROM Customers
>   WHERE customerId = c.CustomerId 
>   AND effectiveDate > c.effectiveDate){code}
> Same query pushing WHERE into JOIN:{code}
> SELECT *
> FROM Orders AS o
> JOIN Customers AS c
> ON o.customerId = c.customerId
> AND NOT EXISTS (
>   SELECT 1
>   FROM Customers
>   WHERE customerId = c.CustomerId 
>   AND effectiveDate > c.effectiveDate){code}
> The same query re-formulated using LAST_VALUE:{code}
> SELECT *
> FROM Orders AS o
> JOIN (
>   SELECT last_value(customerId) OVER w AS customerId,
>     last_value(customerName) OVER w AS customerName
>   FROM Customers AS c
>   WINDOW w AS (PARTITION BY customerId
>     ORDER BY effectiveDate DESC)){code}
> This formulation avoids the self-join.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to