Re: Want to improve the performance for execution of Hive Jobs.

Nitin Pawar Mon, 07 May 2012 22:01:33 -0700

1) Check the JobTracker web UI to see how many mappers and reducers have
been launched for your job.
2) If you have a large dataset and want to process it quickly, set
mapred.min.split.size and mapred.max.split.size to suitable values so
that more mappers are launched and the job finishes sooner.
3) If you are doing joins, there are different approaches to take
depending on the kind of data you have and its size.
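
To make point 2 a bit more concrete, here is a rough sketch of how the split size settings relate to the number of mappers. It assumes the split size rule used by Hadoop's FileInputFormat, max(minSize, min(maxSize, blockSize)), and a single large splittable input; Hive's combining input formats and inputs made of many small files will behave differently. The function names and the 10 GB / 64 MB figures are made up for illustration, so treat the result as an estimate only.

```python
MB = 1024 * 1024


def estimate_split_size(block_size, min_split, max_split):
    """Approximate split size: max(minSize, min(maxSize, blockSize))."""
    return max(min_split, min(max_split, block_size))


def estimate_mappers(input_bytes, block_size, min_split, max_split):
    """Rough mapper count for one large splittable input (ceiling division)."""
    split = estimate_split_size(block_size, min_split, max_split)
    return (input_bytes + split - 1) // split


def hive_set_statements(min_split, max_split):
    """Build the SET statements to run in the Hive session before the query."""
    return [
        f"SET mapred.min.split.size={min_split};",
        f"SET mapred.max.split.size={max_split};",
    ]


if __name__ == "__main__":
    input_bytes = 10 * 1024 * MB   # hypothetical 10 GB table
    block_size = 64 * MB           # assumed HDFS block size

    # Compare a large max split size with a smaller one.
    for max_split in (256 * MB, 32 * MB):
        mappers = estimate_mappers(input_bytes, block_size, 1, max_split)
        print(f"max split {max_split // MB} MB -> about {mappers} mappers")

    for statement in hive_set_statements(1, 32 * MB):
        print(statement)
```

The SET statements can be issued through the same JDBC connection before the query itself. Lowering the maximum split size below the block size gives more, smaller map tasks, which only helps while the cluster has free map slots to run them.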


It would help if you could share your data sizes and query details.

On Tue, May 8, 2012 at 10:07 AM, Bhavesh Shah <bhavesh25s...@gmail.com> wrote:

> Hello all,
> I have written Hive JDBC code and packaged it as a JAR. I am running that
> JAR on a 10-node cluster.
> The problem is that even with the 10-node cluster, the performance is the
> same as on a single node.
>
> What can I do to improve the performance of Hive jobs? Are there any
> configuration settings to set before submitting Hive jobs to the cluster?
> I would also like to know how we can tell whether a job is running on
> all the nodes of the cluster.
>
> Please let me know if anyone knows about this.
>
> --
> Regards,
> Bhavesh Shah
>
>


-- 
Nitin Pawar
