Hi Yongqiang,

Please go ahead and update the wiki page. I will copy it over to version control when you are done.
Thanks.

Carl

On Tue, Aug 10, 2010 at 2:11 PM, yongqiang he <heyongqiang...@gmail.com> wrote:

> In the Hive Join wiki page, it says
> "THIS PAGE WAS MOVED TO HIVE XDOCS ! DO NOT EDIT!Join Syntax"
>
> Where should I do the update?
>
> On Fri, Aug 6, 2010 at 11:46 PM, yongqiang he <heyongqiang...@gmail.com> wrote:
> > Yeah. The sort merge bucket mapjoin has been finished for some time,
> > and seems stable now. I did one skew join but haven't gotten a chance
> > to look at another skew join Namit mentioned to me. I definitely
> > should have updated the wiki earlier. My bad.
> >
> > On Fri, Aug 6, 2010 at 8:32 PM, Jeff Hammerbacher <ham...@cloudera.com> wrote:
> >> Yongqiang mentioned he was going to update the wiki with this
> >> information in the thread at
> >> http://hadoop.markmail.org/thread/hxd4uwwukuo46lgw.
> >>
> >> Yongqiang, have you gotten a chance to complete the sort merge bucket
> >> map join and the other skew join you mention in the above thread?
> >>
> >> Thanks,
> >> Jeff
> >>
> >> On Fri, Aug 6, 2010 at 3:43 AM, bharath vissapragada
> >> <bhara...@students.iiit.ac.in> wrote:
> >>>
> >>> Roberto,
> >>>
> >>> You may find these links useful:
> >>>
> >>> http://www.slideshare.net/ragho/hive-icde-2010?src=related_normal&rel=2374551
> >>> - Simple joins and optimizations.
> >>>
> >>> http://www.slideshare.net/zshao/hive-user-meeting-march-2010-hive-team
> >>> - New kinds of joins / features of Hive.
> >>>
> >>> Thanks
> >>>
> >>> Bharath.V
> >>> 4th year Undergraduate
> >>> IIIT Hyderabad
> >>>
> >>> On Fri, Aug 6, 2010 at 12:16 PM, Cappa Roberto
> >>> <roberto.ca...@guest.telecomitalia.it> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> I cannot find any documentation about the algorithm Hive uses to
> >>>> translate JOIN clauses into Map-Reduce tasks.
> >>>>
> >>>> In particular, if I have two tables A and B, each table is written
> >>>> to a separate file and each file is split across Hadoop nodes. When
> >>>> I perform a JOIN with A.column = B.column, the framework has to
> >>>> compare the full data from the first file with the full data from
> >>>> the second file. How can Hadoop perform a full scan of all possible
> >>>> combinations of values? If each node contains only a portion of
> >>>> each file, a complete comparison does not seem possible. Is one of
> >>>> the two files entirely replicated on each node? Or does Hive use
> >>>> another kind of strategy/optimization?
> >>>>
> >>>> Thanks.
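
On Roberto's original question, here is a minimal sketch of the idea behind a reduce-side (shuffle) equi-join, the general pattern a Map-Reduce join can follow when neither table is replicated. It is plain Python that simulates the phases in one process, not Hive code. All names (map_phase, shuffle, reduce_phase, reduce_side_join, the "A"/"B" tags) are made up for illustration, and rows are assumed to be tuples. The point is that no node needs both files in full: every row is routed by its join key, so matching rows from A and B meet at the same reducer.

```python
from collections import defaultdict
from itertools import product


def map_phase(table_tag, rows, key_index):
    """Map step for one input split: emit (join_key, (table_tag, row))."""
    for row in rows:
        yield row[key_index], (table_tag, row)


def shuffle(emitted, num_reducers):
    """Route each pair to a reducer chosen by hashing the join key.

    Rows with equal keys always land in the same partition, regardless
    of which table or which split they came from.
    """
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, tagged_row in emitted:
        partitions[hash(key) % num_reducers][key].append(tagged_row)
    return partitions


def reduce_phase(partition):
    """Reduce step: for each key, pair every A row with every B row."""
    for tagged_rows in partition.values():
        a_rows = [row for tag, row in tagged_rows if tag == "A"]
        b_rows = [row for tag, row in tagged_rows if tag == "B"]
        for a_row, b_row in product(a_rows, b_rows):
            yield a_row + b_row


def reduce_side_join(a_splits, b_splits, a_key, b_key, num_reducers=3):
    """Simulate the whole job: map every split, shuffle, then reduce."""
    emitted = []
    for split in a_splits:
        emitted.extend(map_phase("A", split, a_key))
    for split in b_splits:
        emitted.extend(map_phase("B", split, b_key))
    joined = []
    for partition in shuffle(emitted, num_reducers):
        joined.extend(reduce_phase(partition))
    return joined


if __name__ == "__main__":
    # Each inner list stands for the portion of a table held on one node.
    a_splits = [[(1, "a1"), (2, "a2")], [(1, "a3"), (3, "a4")]]
    b_splits = [[(1, "b1")], [(2, "b2"), (4, "b3")]]
    for row in sorted(reduce_side_join(a_splits, b_splits, 0, 0)):
        print(row)
```

Replicating one table to every node, as the question suggests, is the other basic option: a map-side join, which is practical when one of the tables is small enough to fit in memory on each mapper.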