Hi Bertrand,

Thanks for the reply.
My question was about the fact that every join in a Hive query constitutes a MapReduce job. A MapReduce job goes through serialization and deserialization of objects. Isn't that an overhead?

You mentioned storing the data in a smarter way. Can you please elaborate on this?

Regards,
Sudeep

On Tue, Aug 14, 2012 at 11:39 AM, Bertrand Dechoux <decho...@gmail.com> wrote:

> You may want to be clearer. Is your question: how can I change the
> serialization strategy of Hive? (If so, I'll let other users answer, and I am
> also interested in the answer.)
>
> Otherwise the answer is simple. If you want to join data which cannot be
> stored in memory, you need to serialize it. The only solution is to
> store the data in a smarter way which would not require you to do the join.
> By the way, how do you know the serialisation is the bottleneck?
>
> Bertrand
>
> On Tue, Aug 14, 2012 at 5:11 PM, sudeep tokala <sudeeptok...@gmail.com> wrote:
>
>> On Tue, Aug 14, 2012 at 11:08 AM, sudeep tokala
>> <sudeeptok...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> How can I avoid serialization and deserialization overhead in a Hive join
>>> query? Will this optimize my query performance?
>>>
>>> Regards
>>> sudeep
>
> --
> Bertrand Dechoux