Hello,

I apologize if this is off topic for the current thread. I was looking for the Hive architecture diagram on this page: http://wiki.apache.org/hadoop/Hive/Design . The PDF link doesn't seem to work for me either.
It would be a great help if someone could direct me to this information.

Thanks,
Akshaya

On Thu, Aug 12, 2010 at 4:37 PM, Edward Capriolo <[email protected]> wrote:

> Joydeep,
>
> I am sorry. I put that when I thought we were going to actively move
> to xdocs. You can remove that if you like.
>
> As I said in a thread before, the problem with the wiki is that no one
> actively updates it. Example:
>
> http://wiki.apache.org/hadoop/Hive/LanguageManual/Select
> oopse: Really what about "in support"...
> https://issues.apache.org/jira/browse/HIVE-801
>
> Which is why I hold the opinion that all patches except bug fixes
> should probably come with xdocs. People are free to disagree.
>
> Edward
>
> On Thu, Aug 12, 2010 at 3:16 AM, Joydeep Sen Sarma <[email protected]> wrote:
> > i hate this message: 'THIS PAGE WAS MOVED TO HIVE XDOCS ! DO NOT EDIT!Join Syntax'
> >
> > why must edits to the wiki be banned if there are xdocs? hadoop has both.
> >
> > there will always be things that are not captured in xdocs. it's pretty
> > sad to discourage free-form edits by people who want to contribute without
> > checking out source. (what is this - the 80s?)
> > ________________________________________
> > From: Edward Capriolo [[email protected]]
> > Sent: Tuesday, August 10, 2010 2:57 PM
> > To: [email protected]
> > Cc: [email protected]
> > Subject: Re: How HIVE manages a join
> >
> > Sorry.
> > $hive_root/docs/xdocs/language_manual/joins.xml
> >
> > On Tue, Aug 10, 2010 at 5:57 PM, Edward Capriolo <[email protected]> wrote:
> >> This page is already in version control..
> >>
> >> /home/edward/cassandra-handler/docs/xdocs/language_manual/joins.xml
> >>
> >> Edward
> >>
> >> On Tue, Aug 10, 2010 at 5:15 PM, Carl Steinbach <[email protected]> wrote:
> >>> Hi Yongqiang,
> >>> Please go ahead and update the wiki page. I will copy it over to version
> >>> control when you are done.
> >>> Thanks.
> >>> Carl
> >>>
> >>> On Tue, Aug 10, 2010 at 2:11 PM, yongqiang he <[email protected]> wrote:
> >>>>
> >>>> In the Hive Join wiki page, it says
> >>>> "THIS PAGE WAS MOVED TO HIVE XDOCS ! DO NOT EDIT!Join Syntax"
> >>>>
> >>>> Where should I do the update?
> >>>>
> >>>> On Fri, Aug 6, 2010 at 11:46 PM, yongqiang he <[email protected]> wrote:
> >>>> > Yeah. The sort merge bucket mapjoin has been finished for some time,
> >>>> > and seems stable now. I did one skew join but haven't had a chance to
> >>>> > look at another skew join Namit mentioned to me. But I definitely should
> >>>> > have updated the wiki earlier. My bad.
> >>>> >
> >>>> > On Fri, Aug 6, 2010 at 8:32 PM, Jeff Hammerbacher <[email protected]> wrote:
> >>>> >> Yongqiang mentioned he was going to update the wiki with this
> >>>> >> information in the thread at
> >>>> >> http://hadoop.markmail.org/thread/hxd4uwwukuo46lgw.
> >>>> >>
> >>>> >> Yongqiang, have you gotten a chance to complete the sort merge bucket
> >>>> >> map join and the other skew join you mention in the above thread?
> >>>> >>
> >>>> >> Thanks,
> >>>> >> Jeff
> >>>> >>
> >>>> >> On Fri, Aug 6, 2010 at 3:43 AM, bharath vissapragada
> >>>> >> <[email protected]> wrote:
> >>>> >>>
> >>>> >>> Roberto ..
> >>>> >>>
> >>>> >>> You may find these links useful ..
> >>>> >>>
> >>>> >>> http://www.slideshare.net/ragho/hive-icde-2010?src=related_normal&rel=2374551
> >>>> >>> - Simple joins and optimizations..
> >>>> >>>
> >>>> >>> http://www.slideshare.net/zshao/hive-user-meeting-march-2010-hive-team
> >>>> >>> - New kind of joins / features of hive ..
> >>>> >>>
> >>>> >>> Thanks
> >>>> >>>
> >>>> >>> Bharath.V
> >>>> >>> 4th year Undergraduate..
> >>>> >>> IIIT Hyderabad
> >>>> >>>
> >>>> >>> On Fri, Aug 6, 2010 at 12:16 PM, Cappa Roberto
> >>>> >>> <[email protected]> wrote:
> >>>> >>>>
> >>>> >>>> Hi,
> >>>> >>>>
> >>>> >>>> I cannot find any documentation about what algorithm HIVE uses to
> >>>> >>>> translate JOIN clauses into Map-Reduce tasks.
> >>>> >>>>
> >>>> >>>> In particular, if I have two tables A and B, each table is written to
> >>>> >>>> a separate file, and each file is split across Hadoop nodes. When I
> >>>> >>>> perform a JOIN with A.column = B.column, the framework has to compare
> >>>> >>>> the full data from the first file with the full data from the second
> >>>> >>>> file. How can Hadoop perform a full scan of all possible combinations
> >>>> >>>> of values? If each node contains a portion of each file, a complete
> >>>> >>>> comparison does not seem possible. Is one of the two files entirely
> >>>> >>>> replicated on each node? Or does HIVE use another kind of
> >>>> >>>> strategy/optimization?
> >>>> >>>>
> >>>> >>>> Thanks.
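
On Roberto's original question at the bottom of the quoted thread, here is a rough sketch of the idea behind a common reduce-side (shuffle) join. It is only an in-process Python simulation, not Hive's actual code: the function names, the "A"/"B" tags, and the sample rows are all made up for illustration. The point it tries to show is that no node needs a full copy of either file. Each mapper tags the rows of its own split with the join key, the shuffle routes every row with the same key to the same reducer, and that reducer pairs the A rows with the B rows for that key.

```python
from collections import defaultdict
from itertools import product


def map_phase(table_tag, rows, key_index):
    """Emit (join_key, (table_tag, row)) for every row of one input split."""
    for row in rows:
        yield row[key_index], (table_tag, row)


def shuffle(pairs):
    """Group mapper output by join key, standing in for the MapReduce shuffle."""
    groups = defaultdict(list)
    for key, tagged_row in pairs:
        groups[key].append(tagged_row)
    return groups


def reduce_phase(groups):
    """For each key, pair every A row with every B row that shares that key."""
    for tagged_rows in groups.values():
        a_rows = [row for tag, row in tagged_rows if tag == "A"]
        b_rows = [row for tag, row in tagged_rows if tag == "B"]
        for a_row, b_row in product(a_rows, b_rows):
            yield a_row + b_row


def reduce_side_join(a_splits, b_splits, a_key=0, b_key=0):
    """Inner equi-join of two tables that are each stored as several splits."""
    pairs = []
    for split in a_splits:  # each split would be read by a separate mapper
        pairs.extend(map_phase("A", split, a_key))
    for split in b_splits:
        pairs.extend(map_phase("B", split, b_key))
    return sorted(reduce_phase(shuffle(pairs)))


# Hypothetical data: each inner list plays the role of one file split.
a_splits = [[(1, "x"), (2, "y")], [(3, "z")]]
b_splits = [[(2, "p")], [(1, "q"), (2, "r")]]

result = reduce_side_join(a_splits, b_splits)
print(result)
```

The other strategy Roberto guesses at, replicating one table to every node, corresponds to what the thread calls a map join: when one table is small enough, it can be loaded by each mapper and joined against the local split of the large table with no reduce step. The sketch above does not cover that case.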
