It is possible that you hit this issue: https://issues.apache.org/jira/browse/HIVE-5973. It is fixed in the Apache Hive 0.13 release.
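
Until you can upgrade, one way to check is to rerun the query in the same session with the SMB conversion switched off and compare the counts. This is only a sketch: I am assuming SMB join was enabled through the two properties below, and your setup may use a different subset of them.

```sql
-- Assumption: SMB join was turned on via these session properties.
-- Setting them to false should make Hive fall back to a regular join.
set hive.auto.convert.sortmerge.join=false;
set hive.optimize.bucketmapjoin.sortedmerge=false;

-- Same query as in the original message, unchanged.
select count(*)
from dss.hist_hshld_profl_mc a
join dss.hshld_summary_mc b
  on a.hh_key = b.hh_key
where ('2012-02-27' between a.hshld_profl_eff_dt and a.hshld_profl_exp_dt)
  and a.hshld_exp_dt = '9999-12-31'
  and trim(a.cntry_id) = 'USA';
```

If the count goes back to the 30 million range with these set to false, that would be consistent with the SMB join path being the cause.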

On Thu, May 1, 2014 at 7:10 PM, Sukhendu Chakraborty <sukhendu.chakrabo...@gmail.com> wrote:
> I am seeing very different number of rows in this query output depending on
> whether I enable SMB join:
>
> select count(*)
> from dss.hist_hshld_profl_mc a
> join
> dss.hshld_summary_mc b
> on a.hh_key = b.hh_key
> where ('2012-02-27' between a.hshld_profl_eff_dt and a.hshld_profl_exp_dt)
> and a.hshld_exp_dt='9999-12-31'
> and trim(a.cntry_id) = 'USA'
>
> The SMB join returns 60 rows (wrong value) while the regular join returns
> 30million plus rows (correct value).
>
> Is there a known issue/jira for this? We are using CDH5.0/hive-0.12.
>
> -Sukhendu