Thanks. But that issue seems to apply to a partitioned, bucketed table with subqueries, whereas my use case is a basic join of non-partitioned bucketed tables. I will try the patch and let you know.

-Sukhendu

On May 2, 2014 12:10 PM, "Thejas Nair" <the...@hortonworks.com> wrote:
> It is possible that you hit this issue -
> https://issues.apache.org/jira/browse/HIVE-5973
> It is fixed in the Apache Hive 0.13 release.
>
> On Thu, May 1, 2014 at 7:10 PM, Sukhendu Chakraborty
> <sukhendu.chakrabo...@gmail.com> wrote:
> > I am seeing a very different number of rows in this query output depending
> > on whether I enable SMB join:
> >
> > select count(*)
> > from dss.hist_hshld_profl_mc a
> > join
> > dss.hshld_summary_mc b
> > on a.hh_key = b.hh_key
> > where ('2012-02-27' between a.hshld_profl_eff_dt and a.hshld_profl_exp_dt)
> > and a.hshld_exp_dt='9999-12-31'
> > and trim(a.cntry_id) = 'USA'
> >
> > The SMB join returns 60 rows (wrong value) while the regular join returns
> > 30 million plus rows (correct value).
> >
> > Is there a known issue/jira for this? We are using CDH5.0/hive-0.12.
> >
> > -Sukhendu