Thanks. But that issue seems to affect partitioned bucketed tables with
subqueries, while my use case is a basic join of non-partitioned bucketed
tables. I will try the patch and let you know.
-Sukhendu
On May 2, 2014 12:10 PM, "Thejas Nair" <the...@hortonworks.com> wrote:

> It is possible that you hit this issue:
> https://issues.apache.org/jira/browse/HIVE-5973
> It is fixed in the Apache Hive 0.13 release.
>
>
> On Thu, May 1, 2014 at 7:10 PM, Sukhendu Chakraborty
> <sukhendu.chakrabo...@gmail.com> wrote:
> > I am seeing a very different number of rows in this query output depending
> > on whether I enable SMB join:
> >
> > select count(*)
> > from dss.hist_hshld_profl_mc  a
> >           join
> >           dss.hshld_summary_mc     b
> >        on a.hh_key = b.hh_key
> >  where ('2012-02-27' between a.hshld_profl_eff_dt and a.hshld_profl_exp_dt)
> >       and a.hshld_exp_dt='9999-12-31'
> >    and trim(a.cntry_id) = 'USA'
> >
> > The SMB join returns 60 rows (wrong value) while the regular join returns
> > more than 30 million rows (correct value).
> >
> > Is there a known issue/jira for this? We are using CDH5.0/hive-0.12.
> >
> > -Sukhendu
>

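A minimal sketch for reproducing the comparison described in the thread,
assuming the commonly documented Hive session properties for
sort-merge-bucket (SMB) joins. The thread does not say which settings were
actually used, so the three properties below are an assumption: run the count
once with them on and once with them off, then compare the results.

```sql
-- Assumption: these are the usual SMB join switches; the original
-- poster's exact settings are not given in the thread.
set hive.auto.convert.sortmerge.join=true;
set hive.optimize.bucketmapjoin=true;
set hive.optimize.bucketmapjoin.sortedmerge=true;

-- Query from the original report, run with SMB join enabled.
select count(*)
from dss.hist_hshld_profl_mc a
join dss.hshld_summary_mc b
  on a.hh_key = b.hh_key
where ('2012-02-27' between a.hshld_profl_eff_dt and a.hshld_profl_exp_dt)
  and a.hshld_exp_dt = '9999-12-31'
  and trim(a.cntry_id) = 'USA';

-- Baseline: switch the SMB conversion off, then rerun the same query
-- and compare the two counts.
set hive.auto.convert.sortmerge.join=false;
set hive.optimize.bucketmapjoin=false;
set hive.optimize.bucketmapjoin.sortedmerge=false;
```

Running EXPLAIN on the query under each set of properties should indicate
whether the planner actually chose a sort-merge-bucket join for the first run.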