Got it. Thanks for your reply.
Zhiwen Sun

On Wed, Dec 23, 2015 at 2:24 PM, Gopal Vijayaraghavan <gop...@apache.org> wrote:

> > But why does disabling mapjoin give better performance when we don't
> > cast to string (user always lazy)?
> >
> > Is the join key comparison in the reduce stage faster?
>
> The HashMap<DoubleWritable, RowContainer> is slower than the full-sort +
> sorted-merge-join.
>
> It shouldn't be, but it hits the worst-case performance for the HashMap
> impl because of a bug in DoubleWritable in Hadoop.
>
> The effect is somewhat the same as
>
> public int hashCode() {
>   return 1;
> }
>
> Read the comments on https://issues.apache.org/jira/browse/HADOOP-12217
>
> Cheers,
> Gopal
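
For anyone reading this thread later, here is a minimal sketch of the "hashCode() returns 1" analogy from the quoted reply. It is not Hive or Hadoop code: CollidingKey, SpreadKey, makeKeys, distinctHashCodes and buildAndProbeNanos are hypothetical names made up for illustration, and the sketch only assumes a plain java.util.HashMap. When every key reports the same hash, all entries share one bucket and the map has to fall back on equals() comparisons inside that bucket, which is the worst case described above. The exact cost depends on the JDK version and on the key type, so no timings are quoted here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.DoubleFunction;

public class ConstantHashDemo {

    /** Hypothetical key: equals() is correct, but hashCode() is constant. */
    static final class CollidingKey {
        final double value;
        CollidingKey(double value) { this.value = value; }
        @Override public boolean equals(Object o) {
            return o instanceof CollidingKey
                && Double.compare(((CollidingKey) o).value, value) == 0;
        }
        @Override public int hashCode() { return 1; } // every key collides
    }

    /** Hypothetical key: same equals(), hash taken from Double.hashCode. */
    static final class SpreadKey {
        final double value;
        SpreadKey(double value) { this.value = value; }
        @Override public boolean equals(Object o) {
            return o instanceof SpreadKey
                && Double.compare(((SpreadKey) o).value, value) == 0;
        }
        @Override public int hashCode() { return Double.hashCode(value); }
    }

    /** Builds keys for the values 0.0, 1.0, ..., n-1. */
    static List<Object> makeKeys(int n, DoubleFunction<Object> factory) {
        List<Object> keys = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            keys.add(factory.apply((double) i));
        }
        return keys;
    }

    /** Counts how many different hashCode() values the keys produce. */
    static int distinctHashCodes(List<Object> keys) {
        Set<Integer> hashes = new HashSet<>();
        for (Object k : keys) {
            hashes.add(k.hashCode());
        }
        return hashes.size();
    }

    /** Inserts every key into a HashMap, looks each one up, returns elapsed ns. */
    static long buildAndProbeNanos(List<Object> keys) {
        long start = System.nanoTime();
        Map<Object, Integer> map = new HashMap<>();
        for (int i = 0; i < keys.size(); i++) {
            map.put(keys.get(i), i);
        }
        long sum = 0;
        for (Object k : keys) {
            sum += map.get(k);
        }
        if (sum < 0) throw new IllegalStateException("unreachable");
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 5_000;
        List<Object> colliding = makeKeys(n, CollidingKey::new);
        List<Object> spread = makeKeys(n, SpreadKey::new);
        System.out.println("distinct hashes (colliding): " + distinctHashCodes(colliding));
        System.out.println("distinct hashes (spread):    " + distinctHashCodes(spread));
        System.out.println("colliding ns: " + buildAndProbeNanos(colliding));
        System.out.println("spread ns:    " + buildAndProbeNanos(spread));
    }
}
```

Running main prints the number of distinct hash codes for each key type followed by two wall-clock measurements. The map stays correct in both cases; only the work per put/get changes, which is why a sort-merge join can end up ahead of a map join built on such keys.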