Re: Optimized Hive query

Mich Talebzadeh Tue, 14 Jun 2016 01:05:13 -0700

I presume the user is concerned with performance?

The whole point of a cost-based optimizer (CBO) is to take care of this by
finding the optimal access path for a query.

Otherwise we would still be relying on a rule-based optimizer (RBO), as in
the old days of Hive.

If you are on a recent version of Hive, the CBO does the job.
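
As a rough sketch of what that means in practice (assuming Hive 0.14 or
later, where the Calcite-based CBO is available; the table "sales" and its
columns are made up for illustration), you can make sure the CBO is on and
has statistics to work with, then compare the plans of a nested query and
its flattened form:

```sql
-- Enable the cost-based optimizer and let it read stored statistics.
SET hive.cbo.enable=true;
SET hive.compute.query.using.stats=true;
SET hive.stats.fetch.column.stats=true;

-- The CBO relies on statistics; "sales" is a hypothetical table.
ANALYZE TABLE sales COMPUTE STATISTICS;
ANALYZE TABLE sales COMPUTE STATISTICS FOR COLUMNS;

-- Nested form of the query.
EXPLAIN
SELECT t.id, t.amount
FROM (SELECT id, amount FROM sales) t
WHERE t.amount > 100;

-- Flattened form of the same query.
EXPLAIN
SELECT id, amount
FROM sales
WHERE amount > 100;
```

If the optimizer is doing its job, the two EXPLAIN outputs should show the
same operator tree.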

However, you may want to consider moving from the map-reduce execution
engine to something like Spark to speed up execution.
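
A minimal sketch of the switch, assuming a Hive build that supports Hive on
Spark (Hive 1.1 or later) and a cluster where a compatible Spark is already
configured. The engine is selected per session:

```sql
-- Show the engine currently in use ("mr" is the map-reduce engine).
SET hive.execution.engine;

-- Switch this session to Spark; "tez" is the other non-map-reduce value.
SET hive.execution.engine=spark;
```

Queries submitted later in the same session then run on the chosen engine
rather than as map-reduce jobs.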

Alternatively, use Spark itself to run the query against the Hive tables
(assuming that you are familiar with the product), so that Spark does the
whole thing (CBO + execution).

Hive itself is pretty mature. It is Hive on map-reduce that is problematic.

HTH

Dr Mich Talebzadeh

LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

On 14 June 2016 at 08:37, Aviral Agarwal <aviral12...@gmail.com> wrote:

> Hi,
> Thanks for the replies.
> I already knew that the optimizer does that.
> My use case is a bit different, though.
> I want to display the flattened query back to the user.
> So I was hoping to use Hive's internal CBO to somehow rewrite the AST
> generated for the query.
>
> Thanks,
> Aviral
>
> On Tue, Jun 14, 2016 at 12:42 PM, Gopal Vijayaraghavan <gop...@apache.org>
> wrote:
>
>>
>> > You can see that you get identical execution plans for the nested query
>> > and the flattened one.
>>
>> That wasn't always the case, though. Back when I started with Hive, before
>> Stinger, it didn't have the identity project remover.
>>
>> To check whether your version has this fix, try looking at
>>
>> hive> set hive.optimize.remove.identity.project;
>>
>>
>> Cheers,
>> Gopal
