[
https://issues.apache.org/jira/browse/TEZ-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17478030#comment-17478030
]
László Bodor edited comment on TEZ-4377 at 1/18/22, 4:52 PM:
-------------------------------------------------------------
[~inza9hi]: thanks for reporting this
even if it works on MR, this kind of correctness problem usually has the root
cause in hive, I recommend you to file a ticket there
some notes:
1. isn't it possible to create a qtest for reproducing this on hive side? I
mean, if the tables are of ~20MBs size, could you give data files that
reproduce the issue (if it contains sensitive data, you can mask it first)
2. could this query be simplified further? there are lots of left joins, can
you check if a simpler query reproduces the same problem?
3. other hive tweaks to find the area of the root cause: disable vectorization
(hive.vectorized.execution.enabled=false)
was (Author: abstractdog):
[~inza9hi]: thanks for reporting this
even if it works on MR, this kind of correctness problem have the root cause in
hive, I recommend you to file a ticket there
some notes:
1. isn't it possible to create a qtest for reproducing this on hive side? I
mean, if the tables are of ~20MBs size, could you give data files that
reproduce the issue (if it contains sensitive data, you can mask it first)
2. could this query be simplified further? there are lots of left joins, can
you check if a simpler query reproduces the same problem?
3. other hive tweaks to find the area of the root cause: disable vectorization
(hive.vectorized.execution.enabled=false)
> Got wrong result when using inner join and left join and window function
> ------------------------------------------------------------------------
>
> Key: TEZ-4377
> URL: https://issues.apache.org/jira/browse/TEZ-4377
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.9.1, 0.10.1
> Reporter: inza9hi
> Priority: Major
>
> I am using Hive 3.1.0 and Tez 0.9.1.
> The sql looks just like this:
>
> {code:java}
> select
> count(innerCode),
> count(esdate)
> from
> (
> select t1.secuCode
> ,t1.innercode
> from
> (select innerCode
> ,secuCode
> ,cate
> from tmp.test_a1
> where cate in (8,13)
> ) t1
> inner join
> (
> select secuCode
> from tmp.test_a2
> where type not in ('pb','pbf','licai')
> ) t2
> on t1.secuCode = t2.secuCode
> ) a
> left join
> (
> select innercode
> ,esdate
>
> from tmp.test_b
> ) e on a.innerCode = e.innercode
> left join
> (
> select innerCode
> ,1 as if_gold_prize
> from tmp.test_c
> where winnertype = 1
> group by innerCode
> ) f1 on a.innerCode = f1.innerCode
> left join
> (
> select innercode
> ,endate enddate1
> from
> (
> select *
> ,row_number() over (partition by innercode order by endate
> desc) max_date
> from tmp.test_d
> ) t
> where max_date = 1
> ) m on a.innerCode = m.innercode
> left join
> (
> select innercode
> ,endate enddate2
> from
> (
> select *
> ,row_number() over (partition by innercode order by endate
> desc) max_date
> from tmp.test_e
> ) t
> where max_date = 1
> ) n on a.innerCode = n.innercode {code}
> The num of the records is right, but ** in the result there are lots of null
> *esdate* ( from the first left join) which {*}should not be null{*}.
> Sorry I can not reproduce the issue with small data sets.
> But something I can give:
> # Most of the tables are very small (about 20MB)
> # Either I remove the inner join or remove any left join, the result is
> right.
> # Disable the map join and cbo did not work either (
> --hiveconf hive.auto.convert.join.noconditionaltask =false --hiveconf
> hive.auto.convert.join=false --hiveconf hive.cbo.enable=false)
> # I tried the latest tez release (0.10.1), there is still the issue
> The sql works well on MR, so i think it is a Tez issue. I am not very
> familiar with Tez. Any suggestion to find the root cause?
> Thank you in advance.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)