[
https://issues.apache.org/jira/browse/HIVE-29262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18034313#comment-18034313
]
KIRTI RUGE commented on HIVE-29262:
-----------------------------------
Consider the query below:
SELECT
    ws_bill_customer_sk,
    ws_item_sk,
    ws_sold_date_sk,
    ws_sales_price,
    LEAD(ws_sales_price) OVER (
        PARTITION BY ws_item_sk, ws_bill_customer_sk
        ORDER BY ws_sold_date_sk
    ) AS next_sales_price,
    LEAD(ws_sales_price) OVER (
        PARTITION BY ws_bill_customer_sk, ws_item_sk
        ORDER BY ws_sold_date_sk
    ) - ws_sales_price AS sales_price_diff
FROM
    web_sales_txt
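The SELECT clause lists ws_bill_customer_sk before ws_item_sk, while the first
PARTITION BY lists them in the opposite order (ws_item_sk, ws_bill_customer_sk).
When the column order differs like this, the query returns incorrect results.
I have attached both output files.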
[^leadlag_Vectorization_ve_false.q.out]
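
For reference, here is a minimal Python sketch of the result I would expect from the two LEAD windows. It is only a model of standard window-function semantics, not Hive code: the lead() helper and the sample rows are hypothetical and are not taken from web_sales_txt. It illustrates that the order of the PARTITION BY columns should not change which rows fall into the same partition, so both windows should return the same value for every row.

```python
from collections import defaultdict


def lead(rows, partition_cols, order_col, value_col):
    """Model of LEAD(value_col) OVER (PARTITION BY ... ORDER BY order_col).

    Returns one value per input row, in input order; None where the row
    has no following row in its partition.
    """
    partitions = defaultdict(list)
    for idx, row in enumerate(rows):
        key = tuple(row[c] for c in partition_cols)
        partitions[key].append(idx)

    result = [None] * len(rows)
    for indices in partitions.values():
        # Sample data has unique order keys per partition, so no tie handling.
        indices.sort(key=lambda i: rows[i][order_col])
        for cur, nxt in zip(indices, indices[1:]):
            result[cur] = rows[nxt][value_col]
    return result


# Hypothetical sample rows (illustration only).
rows = [
    {"ws_bill_customer_sk": 1, "ws_item_sk": 10, "ws_sold_date_sk": 100, "ws_sales_price": 5.0},
    {"ws_bill_customer_sk": 1, "ws_item_sk": 10, "ws_sold_date_sk": 101, "ws_sales_price": 7.5},
    {"ws_bill_customer_sk": 2, "ws_item_sk": 10, "ws_sold_date_sk": 100, "ws_sales_price": 3.0},
    {"ws_bill_customer_sk": 1, "ws_item_sk": 20, "ws_sold_date_sk": 102, "ws_sales_price": 9.0},
    {"ws_bill_customer_sk": 2, "ws_item_sk": 10, "ws_sold_date_sk": 103, "ws_sales_price": 4.0},
]

by_item_customer = lead(
    rows, ["ws_item_sk", "ws_bill_customer_sk"], "ws_sold_date_sk", "ws_sales_price"
)
by_customer_item = lead(
    rows, ["ws_bill_customer_sk", "ws_item_sk"], "ws_sold_date_sk", "ws_sales_price"
)

# Both partition-column orders group the same rows together.
print(by_item_customer == by_customer_item)
```

Under this model, sales_price_diff should equal next_sales_price - ws_sales_price on every row where next_sales_price is not NULL, which gives a quick way to spot bad rows in the two attached outputs.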
> LEAD/LAG functions shows incorrect data when vectorization enabled
> ------------------------------------------------------------------
>
> Key: HIVE-29262
> URL: https://issues.apache.org/jira/browse/HIVE-29262
> Project: Hive
> Issue Type: Bug
> Reporter: KIRTI RUGE
> Priority: Major
> Labels: correctness, needs-repro-steps
> Attachments: leadlag_Vectorization.q,
> leadlag_Vectorization_ve_false.q.out, leadlag_Vectorization_vect_true.q.out
>
>
>
> Please use this leadlag_Vectorization.q file to repro this scenario.
> This case happens when the table has more columns.
> mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=leadlag_Vectorization.q
--
This message was sent by Atlassian Jira
(v8.20.10#820010)