Hi

From the JobTracker web UI you can get the total number of map and reduce slots. The same web UI also shows the number of running map/reduce tasks. Subtracting the second value from the first gives you the available slots.
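To make the subtraction concrete, here is a minimal sketch in Python. The function name and the slot counts are made up for illustration; in practice you would read the real values off the JobTracker web UI, once for map slots and once for reduce slots.

```python
def available_slots(total_slots: int, running_tasks: int) -> int:
    """Return free slots: configured slots minus currently running tasks."""
    if total_slots < 0 or running_tasks < 0:
        raise ValueError("slot and task counts must be non-negative")
    # Clamp at zero in case the two numbers were read at slightly different moments.
    return max(total_slots - running_tasks, 0)


# Hypothetical values, as they might be read from the web UI of a small cluster.
free_map_slots = available_slots(total_slots=8, running_tasks=8)
free_reduce_slots = available_slots(total_slots=4, running_tasks=1)

print("free map slots:", free_map_slots)
print("free reduce slots:", free_reduce_slots)
```

If the free map slot count stays at zero while one large query is running, later queries are most likely waiting for slots rather than being blocked by Hive itself.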
The fair scheduler is a feature of MapReduce, not of Hive. It is primarily used to control the number of slots used by each user/pool in a cluster. You can read more at http://hadoop.apache.org/docs/mapreduce/r0.20.2/fair_scheduler.html

Regards
Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: Chunky Gupta <chunky.gu...@vizury.com>
Date: Mon, 22 Oct 2012 18:22:03
To: <user@hive.apache.org>
Reply-To: user@hive.apache.org
Cc: <bejoy...@yahoo.com>; <decho...@gmail.com>
Subject: Re: How to run multiple Hive queries in parallel

Hi Bejoy and Bertrand

Thanks for quick reply. I think tasks slots are not available in my cluster because I have only 4 slave machines. Actually I am beginner to HIVE. So, if you can let me know how I can check if time slots are available or not.

I have different users credentials to log in into my name node machine, but I don't have much idea about fair scheduler. In case time slots are not available and are exhausted, then if you can please point me to some publicly available fair scheduler which I can integrate with HIVE to solve my problem.

Thank You,
Chunky.

On Mon, Oct 22, 2012 at 5:52 PM, Bertrand Dechoux <decho...@gmail.com> wrote:

> Bejoy is right. I just want to say explicitly that the scheduler
> configuration is something which is orthogonal to the use of Hive. (ie same
> problem with Pig or standard MapReduce jobs).
>
> Regards
>
> Bertrand
>
> PS : There is also the capacity scheduler.
>
>
> On Mon, Oct 22, 2012 at 2:18 PM, Bejoy KS <bejoy...@yahoo.com> wrote:
>
>> Hi
>>
>> Is your hive queries in waiting mode even though there are task slots
>> available on your cluster?
>>
>> If task slots are getting exhausted and you need parallelism here, then
>> you may need to look at some approaches of using fair scheduler and
>> different user accounts for each user so that each user gets his fair share
>> of task slots.
>>
>>
>> Regards
>> Bejoy KS
>>
>> Sent from handheld, please excuse typos.
>> ------------------------------
>> *From: * Chunky Gupta <chunky.gu...@vizury.com>
>> *Date: *Mon, 22 Oct 2012 17:27:45 +0530
>> *To: *<user@hive.apache.org>
>> *ReplyTo: * user@hive.apache.org
>> *Subject: *How to run multiple Hive queries in parallel
>>
>> Hi,
>>
>> I have one name node machine and under which there are 4 slaves machines
>> to run the job.
>>
>> The way users run queries is
>> - They ssh into the name node machine
>> - They initiate hive and submit their queries
>>
>> Currently multiple users log in with the same credentials and submit
>> queries
>>
>> Whenever 2 or more users try to run queries at a same time from different
>> hive console , it runs only one query at a time and when that query is
>> finished then only next query starts executing and so on.
>>
>> In this scenario if there is a large query which is submitted earlier
>> then all the other queries have to wait for that query to complete.
>>
>> I want to run multiple query at the same time. Is there any way or any
>> configuration parameter to do the same ?
>>
>> PS: The data is in S3 and running HIVE on AWS EMR infrastructure in
>> interactive mode.
>>
>> Thank You,
>> Chunky.
>>
>>
>
>
> --
> Bertrand Dechoux
>