1) Check the JobTracker URL to see how many map and reduce tasks have been launched.
2) If you have a large dataset and want it processed faster, tune mapred.min.split.size and mapred.max.split.size so that the input is divided into more splits; more splits means more mappers are launched in parallel and the job finishes sooner.
3) If you are doing joins, there are different approaches depending on the data you have and its size.
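
To illustrate point 2, here is a rough sketch of how the split-size settings translate into a mapper count. It assumes the clamping rule used by Hadoop's FileInputFormat (split size = max(min split, min(max split, block size))). Hive's combining input format may group files differently, so treat the numbers as an estimate only. The function names and the 10 GB / 64 MB figures are made up for the example.

```python
import math

MB = 1024 * 1024

def split_size(block_size, min_split, max_split):
    # Assumed rule (FileInputFormat style): clamp the block size
    # between the configured minimum and maximum split sizes.
    return max(min_split, min(max_split, block_size))

def estimate_mappers(file_size, block_size, min_split, max_split):
    # Roughly one mapper per split; real jobs can differ slightly
    # because Hadoop lets the last split run a little over.
    return math.ceil(file_size / split_size(block_size, min_split, max_split))

# Hypothetical 10 GB input stored with 64 MB blocks.
file_size = 10 * 1024 * MB
block_size = 64 * MB

# Max split larger than the block size: one split per block.
default_mappers = estimate_mappers(file_size, block_size, 1, 256 * MB)

# Max split lowered to 32 MB: each block is cut in two.
tuned_mappers = estimate_mappers(file_size, block_size, 1, 32 * MB)

print(default_mappers, tuned_mappers)
```

With these assumed numbers, halving the maximum split size doubles the estimated mapper count. Whether that helps depends on how many map slots the cluster actually has free.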
It would help if you could share your data sizes and query details.

On Tue, May 8, 2012 at 10:07 AM, Bhavesh Shah <bhavesh25s...@gmail.com> wrote:

> Hello all,
> I have written a Hive JDBC code and created a JAR of it. I am running that
> JAR on 10 cluster.
> But the problem as I am using the 10 cluster still the performance is same
> as that on single cluster.
>
> What to do to improve the performance of Hive Jobs? Is there anything
> configuration setting to set before the submitting Hive Jobs to cluster?
> One more thing I want to know is that How can we come to know that is job
> running on all cluster?
>
> Please let me know if anyone knows about it?
>
> --
> Regards,
> Bhavesh Shah

--
Nitin Pawar