Re: OPTIMIZING A HIVE QUERY

sudeep tokala Tue, 14 Aug 2012 10:30:46 -0700

Hi Bertrand,

Thanks for the reply.


My question was this: every join in a Hive query constitutes a MapReduce
job, and a MapReduce job goes through serialization and deserialization
of objects. Isn't that an overhead?
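
To make the concern concrete, here is a rough sketch in plain Python of
what a reduce-side join does with each row. It is not Hive code: the
function names are made up for illustration, and pickle is only a
stand-in for whatever serialization format is actually used between the
map and reduce phases.

```python
import pickle
from collections import defaultdict

def map_phase(table_name, rows, key_index):
    """Emit (join_key, serialized tagged row) pairs, as a mapper would."""
    for row in rows:
        # Every row is serialized before it leaves the mapper.
        yield row[key_index], pickle.dumps((table_name, row))

def shuffle(pairs):
    """Group serialized payloads by join key (stand-in for the shuffle)."""
    groups = defaultdict(list)
    for key, payload in pairs:
        groups[key].append(payload)
    return groups

def reduce_phase(groups, left_table):
    """Deserialize each payload and join left rows with right rows per key."""
    joined = []
    for key in sorted(groups):
        left, right = [], []
        for payload in groups[key]:
            # Every row is deserialized again on the reducer side.
            tag, row = pickle.loads(payload)
            (left if tag == left_table else right).append(row)
        for l in left:
            for r in right:
                joined.append(l + r)
    return joined

# Hypothetical sample tables, joined on the first column.
orders = [(1, "book"), (2, "pen"), (1, "lamp")]
customers = [(1, "alice"), (2, "bob")]

pairs = list(map_phase("orders", orders, 0))
pairs += list(map_phase("customers", customers, 0))
result = reduce_phase(shuffle(pairs), left_table="orders")
print(result)
```

Each row is serialized once in the mapper and deserialized once in the
reducer, and that per-row cost is the overhead I am asking about.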

Store the data in a smarter way? Can you please elaborate on this?

Regards
Sudeep

On Tue, Aug 14, 2012 at 11:39 AM, Bertrand Dechoux <decho...@gmail.com> wrote:

> You may want to be clearer. Is your question: how can I change the
> serialization strategy of Hive? (If so, I will let other users answer, and
> I am also interested in the answer.)
>
> Otherwise the answer is simple. If you want to join data which cannot be
> held in memory, you need to serialize it. The only solution is to
> store the data in a smarter way which would not require you to do the join.
> By the way, how do you know the serialisation is the bottleneck?
>
> Bertrand
>
>
> On Tue, Aug 14, 2012 at 5:11 PM, sudeep tokala <sudeeptok...@gmail.com> wrote:
>
>>
>>
>> On Tue, Aug 14, 2012 at 11:08 AM, sudeep tokala
>> <sudeeptok...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> How can I avoid serialization and deserialization overhead in a Hive
>>> join query? Will this improve my query performance?
>>>
>>> Regards
>>> sudeep
>>>
>>
>>
>
>
> --
> Bertrand Dechoux
>
