Re: How HIVE manages a join

akshaya iyengar Thu, 12 Aug 2010 14:20:05 -0700

Thank you very much for the links.

Akshaya
On Thu, Aug 12, 2010 at 5:16 PM, Raghu Murthy <[email protected]> wrote:


> The hive.pdf link in the Design page is this one:
> http://www.slideshare.net/namit_jain/hive-demo-paper-at-vldb-2009
>
> A later paper in ICDE'10 is available here:
> http://i.stanford.edu/~ragho/hive-icde2010.pdf
>
> Both of these papers and others are linked from:
> http://wiki.apache.org/hadoop/Hive/Presentations
>
> Hope this helps.
>
>
> On Aug 12, 2010, at 2:05 PM, akshaya iyengar wrote:
>
> Hello,
> I apologize if this is out of context for the current thread. I was looking
> for the Hive architecture diagram on this page
> http://wiki.apache.org/hadoop/Hive/Design . The PDF link doesn't seem to
> work for me either.
>
> It would be a great help if someone could direct me to this information.
>
> Thanks,
> Akshaya
>
> On Thu, Aug 12, 2010 at 4:37 PM, Edward Capriolo <[email protected]> wrote:
>
>> Joydeep,
>>
>> I am sorry. I put that when I thought we were going to actively move
>> to xdocs. You can remove that if you like.
>>
>> As I said in a thread before, the problem with the wiki is that no one
>> actively updates it. Example:
>>
>> http://wiki.apache.org/hadoop/Hive/LanguageManual/Select
>> oopse: Really what about "in support"...
>> https://issues.apache.org/jira/browse/HIVE-801
>>
>> Which is why I hold the opinion that all patches except bug fixes
>> should probably come with xdocs. People are free to disagree.
>>
>> Edward
>>
>> On Thu, Aug 12, 2010 at 3:16 AM, Joydeep Sen Sarma <[email protected]>
>> wrote:
>> > i hate this message: 'THIS PAGE WAS MOVED TO HIVE XDOCS ! DO NOT
>> EDIT!Join Syntax'
>> >
>> > why must edits to the wiki be banned if there are xdocs? hadoop has
>> both.
>> >
>> > there will always be things that are not captured in xdocs. it's pretty
>> sad to discourage free form edits by people who want to contribute without
>> checking out source. (what is this - the 80s?)
>> > ________________________________________
>> > From: Edward Capriolo [[email protected]]
>> > Sent: Tuesday, August 10, 2010 2:57 PM
>> > To: [email protected]
>> > Cc: [email protected]
>> > Subject: Re: How HIVE manages a join
>> >
>> > Sorry.
>> > $hive_root/docs/xdocs/language_manual/joins.xml
>> >
>> > On Tue, Aug 10, 2010 at 5:57 PM, Edward Capriolo <[email protected]>
>> wrote:
>> >> This page is already in version control.
>> >>
>> >> /home/edward/cassandra-handler/docs/xdocs/language_manual/joins.xml
>> >>
>> >> Edward
>> >>
>> >> On Tue, Aug 10, 2010 at 5:15 PM, Carl Steinbach <[email protected]>
>> wrote:
>> >>> Hi Yongqiang,
>> >>> Please go ahead and update the wiki page. I will copy it over to
>> version
>> >>> control when you are done.
>> >>> Thanks.
>> >>> Carl
>> >>>
>> >>> On Tue, Aug 10, 2010 at 2:11 PM, yongqiang he <
>> [email protected]>
>> >>> wrote:
>> >>>>
>> >>>> In the Hive Join wiki page, it says
>> >>>> "THIS PAGE WAS MOVED TO HIVE XDOCS ! DO NOT EDIT!Join Syntax"
>> >>>>
>> >>>> Where should i do the update?
>> >>>>
>> >>>> On Fri, Aug 6, 2010 at 11:46 PM, yongqiang he <
>> [email protected]>
>> >>>> wrote:
>> >>>> > Yeah. The sort merge bucket mapjoin has been finished for some time,
>> >>>> > and seems stable now. I did one skew join but haven't gotten a chance
>> >>>> > to look at another skew join Namit mentioned to me. But I definitely
>> >>>> > should have updated the wiki earlier. My bad.
>> >>>> >
>> >>>> > On Fri, Aug 6, 2010 at 8:32 PM, Jeff Hammerbacher <
>> [email protected]>
>> >>>> > wrote:
>> >>>> >> Yongqiang mentioned he was going to update the wiki with this
>> >>>> >> information in
>> >>>> >> the thread at http://hadoop.markmail.org/thread/hxd4uwwukuo46lgw.
>> >>>> >>
>> >>>> >> Yongqiang, have you gotten a chance to complete the sort merge
>> >>>> >> bucket map join and the other skew join you mention in the above
>> >>>> >> thread?
>> >>>> >>
>> >>>> >> Thanks,
>> >>>> >> Jeff
>> >>>> >>
>> >>>> >> On Fri, Aug 6, 2010 at 3:43 AM, bharath vissapragada
>> >>>> >> <[email protected]> wrote:
>> >>>> >>>
>> >>>> >>> Roberto ..
>> >>>> >>>
>> >>>> >>> You may find these links useful ..
>> >>>> >>>
>> >>>> >>>
>> >>>> >>>
>> >>>> >>>
>> http://www.slideshare.net/ragho/hive-icde-2010?src=related_normal&rel=2374551
>> >>>> >>> - Simple joins and optimizations..
>> >>>> >>>
>> >>>> >>>
>> >>>> >>>
>> http://www.slideshare.net/zshao/hive-user-meeting-march-2010-hive-team  -
>> >>>> >>> New kind of joins / features of hive ..
>> >>>> >>>
>> >>>> >>> Thanks
>> >>>> >>>
>> >>>> >>> Bharath.V
>> >>>> >>> 4th year Undergraduate..
>> >>>> >>> IIIT Hyderabad
>> >>>> >>>
>> >>>> >>> On Fri, Aug 6, 2010 at 12:16 PM, Cappa Roberto
>> >>>> >>> <[email protected]> wrote:
>> >>>> >>>>
>> >>>> >>>> Hi,
>> >>>> >>>>
>> >>>> >>>> I cannot find any documentation about what algorithm HIVE uses to
>> >>>> >>>> translate JOIN clauses into Map-Reduce tasks.
>> >>>> >>>>
>> >>>> >>>> In particular, suppose I have two tables A and B, each table is
>> >>>> >>>> written to a separate file, and each file is split across hadoop
>> >>>> >>>> nodes. When I perform a JOIN with A.column = B.column, the framework
>> >>>> >>>> has to compare the full data from the first file with the full data
>> >>>> >>>> from the second file. How can hadoop perform a full scan of all
>> >>>> >>>> possible combinations of values? If each node contains only a
>> >>>> >>>> portion of each file, a complete comparison does not seem possible.
>> >>>> >>>> Is one of the two files entirely replicated on each node? Or does
>> >>>> >>>> HIVE use another kind of strategy/optimization?
>> >>>> >>>>
>> >>>> >>>> Thanks.
>> >>>> >>
>> >>>> >>
>> >>>> >
>> >>>
>> >>>
>> >>
>> >
>>
>
>
>
