Re: How HIVE manages a join

bharath vissapragada Fri, 06 Aug 2010 03:44:20 -0700

Roberto,

You may find these links useful:

http://www.slideshare.net/ragho/hive-icde-2010?src=related_normal&rel=2374551 -
Simple joins and optimizations.

http://www.slideshare.net/zshao/hive-user-meeting-march-2010-hive-team -
New kinds of joins / features of Hive.
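
To give a rough idea of the basic approach to your question: in the common
(reduce-side) join, no file has to be fully replicated. Each mapper tags its
rows with the table they came from and emits them keyed by the join column;
the shuffle then sends all rows with the same key to the same reducer, which
pairs them up. The sketch below simulates that flow in plain Python on a
single machine. It is only an illustration, not Hive's actual code, and all
function names in it (map_phase, shuffle, reduce_phase, reduce_side_join) are
made up for this example.

```python
from collections import defaultdict
from itertools import product


def map_phase(table_tag, rows, key_index):
    """Emit (join_key, (table_tag, row)) for every row of one table split."""
    for row in rows:
        yield row[key_index], (table_tag, row)


def shuffle(pairs):
    """Group mapper output by key, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, tagged_row in pairs:
        groups[key].append(tagged_row)
    return groups


def reduce_phase(groups):
    """For each key, pair every A row with every B row that shares that key."""
    for tagged_rows in groups.values():
        a_rows = [row for tag, row in tagged_rows if tag == "A"]
        b_rows = [row for tag, row in tagged_rows if tag == "B"]
        for a_row, b_row in product(a_rows, b_rows):
            yield a_row + b_row


def reduce_side_join(a_rows, b_rows, a_key=0, b_key=0):
    """Inner equi-join of two lists of tuples on the given column indexes."""
    pairs = list(map_phase("A", a_rows, a_key))
    pairs += list(map_phase("B", b_rows, b_key))
    return list(reduce_phase(shuffle(pairs)))


table_a = [(1, "x"), (2, "y")]
table_b = [(1, "p"), (1, "q"), (3, "r")]
joined = reduce_side_join(table_a, table_b)
```

Only rows that share a key ever meet, so the full cross-comparison you
describe is never needed. When one table is small enough to fit in memory,
the map-side join covered in the slides does take the other route you
suggested and gives every mapper a full copy of the small table.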

Thanks

Bharath.V
4th year Undergraduate..
IIIT Hyderabad

On Fri, Aug 6, 2010 at 12:16 PM, Cappa Roberto <
[email protected]> wrote:

> Hi,
>
> I cannot find any documentation about what algorithm HIVE uses to
> translate JOIN clauses into Map-Reduce tasks.
>
> In particular, if I have two tables A and B, each table is written to a
> separate file and each file is split across hadoop nodes. When I perform a
> JOIN with A.column = B.column, the framework has to compare the full data
> from the first file with the full data from the second file. How can hadoop
> perform a full scan of all possible combinations of values? If each node
> contains only a portion of each file, a complete comparison does not seem
> possible. Is one of the two files entirely replicated on each node? Or does
> HIVE use another kind of strategy/optimization?
>
> Thanks.
