Then in that case, will I be using a group name tag in the allocations file, like this, inside each pool?

<group name="ABC">
  <maxRunningJobs>6</maxRunningJobs>
</group>

Thanks,
Praveenesh

On Wed, Jan 25, 2012 at 8:08 PM, Harsh J <ha...@cloudera.com> wrote:

> A solution would be to place your users into groups, and use the
> group.name identifier as the poolnameproperty. Would this work for
> you instead?
>
> On Wed, Jan 25, 2012 at 8:00 PM, praveenesh kumar <praveen...@gmail.com> wrote:
> > Also, with the above-mentioned method, my problem is that I am having one
> > pool per user (that's obviously not a good way of configuring schedulers).
> > How can I allocate multiple users to one pool in the XML properties, so
> > that I don't have to care about giving any options inside my code?
> >
> > Thanks,
> > Praveenesh
> >
> > On Wed, Jan 25, 2012 at 7:55 PM, praveenesh kumar <praveen...@gmail.com> wrote:
> >
> >> I am looking for a solution where we can do it permanently, without
> >> specifying these things inside jobs.
> >> I want to keep these things hidden from the end user.
> >> The end user would just write Pig scripts, and all the jobs submitted by a
> >> particular user would get submitted to their respective pool automatically.
> >>
> >> What I am doing right now is something like this:
> >>
> >> <allocations>
> >>   <pool name="ABC">
> >>     <minMaps>10</minMaps>
> >>     <minReduces>10</minReduces>
> >>     <maxMaps>192</maxMaps>
> >>     <maxReduces>96</maxReduces>
> >>     <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
> >>   </pool>
> >>   <user name="ABC">
> >>     <maxRunningJobs>6</maxRunningJobs>
> >>   </user>
> >>   <userMaxJobsDefault>3</userMaxJobsDefault>
> >>   <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
> >>
> >>   <pool name="XYZ">
> >>     <minMaps>10</minMaps>
> >>     <minReduces>10</minReduces>
> >>     <maxMaps>192</maxMaps>
> >>     <maxReduces>96</maxReduces>
> >>     <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
> >>   </pool>
> >>   <user name="XYZ">
> >>     <maxRunningJobs>6</maxRunningJobs>
> >>   </user>
> >>   <userMaxJobsDefault>3</userMaxJobsDefault>
> >>   <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
> >> </allocations>
> >>
> >> By doing this, I am able to see different pools per user, without
> >> mentioning anything inside the jobs.
> >> Jobs automatically go to the respective pools.
> >>
> >> But what I wanted to know is: is this the right way to do it?
> >>
> >> Thanks,
> >> Praveenesh
> >>
> >> On Wed, Jan 25, 2012 at 7:36 PM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >>> Set the property in Pig with the 'set' command or other ways:
> >>> http://pig.apache.org/docs/r0.9.1/cmds.html#set or
> >>> http://pig.apache.org/docs/r0.9.1/start.html#properties
> >>>
> >>> As Srinivas covered earlier, pool allocation can be done per-user if
> >>> you set the scheduler poolnameproperty to "user.name". Per group if
> >>> you set the property to "group.name".
> >>>
> >>> Then you can provide per-poolname config overrides via the "pool"
> >>> element config described in
> >>> http://hadoop.apache.org/common/docs/current/fair_scheduler.html#Allocation+File+%28fair-scheduler.xml%29
> >>>
> >>> On Wed, Jan 25, 2012 at 7:01 PM, praveenesh kumar <praveen...@gmail.com> wrote:
> >>> > I am running Pig jobs; how can I specify which pool they should run in?
> >>> > Also, do you mean the pool allocation is done job-wise, not user-wise?
> >>> >
> >>> > On Wed, Jan 25, 2012 at 6:14 PM, Srinivas Surasani <vas...@gmail.com> wrote:
> >>> >
> >>> >> Praveenesh,
> >>> >>
> >>> >> You can try setting "mapred.fairscheduler.pool" to your pool name while
> >>> >> running the job. By default, mapred.fairscheduler.poolnameproperty is set to
> >>> >> user.name (each job run by a user is allocated to his named pool), and you
> >>> >> can also change this property to group.name.
> >>> >>
> >>> >> Srinivas --
> >>> >>
> >>> >> Also, you can set
> >>> >>
> >>> >> On Wed, Jan 25, 2012 at 6:24 AM, praveenesh kumar <praveen...@gmail.com> wrote:
> >>> >>
> >>> >> > Understanding Fair Schedulers better.
> >>> >> >
> >>> >> > Can we create multiple pools in Fair Schedulers? I guess yes. Please
> >>> >> > correct me.
> >>> >> >
> >>> >> > Suppose I have 2 pools in my fair-scheduler.xml:
> >>> >> >
> >>> >> > 1. Hadoop-users: Min map: 10, Max map: 50, Min reduce: 10, Max reduce: 50
> >>> >> > 2. Admin-users: Min map: 20, Max map: 80, Min reduce: 20, Max reduce: 80
> >>> >> >
> >>> >> > I have 5 users who will be using these pools. How will I allocate
> >>> >> > specific pools to specific users?
> >>> >> >
> >>> >> > Suppose I want user1, user2 to use the "Hadoop-users" pool and
> >>> >> > user3, user4, user5 to use "Admin-users".
> >>> >> >
> >>> >> > In http://hadoop.apache.org/common/docs/r0.20.205.0/fair_scheduler.html
> >>> >> > they have mentioned allocations something like this:
> >>> >> >
> >>> >> > <?xml version="1.0"?>
> >>> >> > <allocations>
> >>> >> >   <pool name="sample_pool">
> >>> >> >     <minMaps>5</minMaps>
> >>> >> >     <minReduces>5</minReduces>
> >>> >> >     <maxMaps>25</maxMaps>
> >>> >> >     <maxReduces>25</maxReduces>
> >>> >> >     <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
> >>> >> >   </pool>
> >>> >> >   <user name="sample_user">
> >>> >> >     <maxRunningJobs>6</maxRunningJobs>
> >>> >> >   </user>
> >>> >> >   <userMaxJobsDefault>3</userMaxJobsDefault>
> >>> >> >   <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
> >>> >> > </allocations>
> >>> >> >
> >>> >> > I tried creating more pools, and it's happening, but how do I allocate
> >>> >> > users to use specific pools?
> >>> >> >
> >>> >> > Thanks,
> >>> >> > Praveenesh
> >>>
> >>> --
> >>> Harsh J
> >>> Customer Ops. Engineer, Cloudera
>
> --
> Harsh J
> Customer Ops. Engineer, Cloudera
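
For anyone following the group-based suggestion in this thread, here is a minimal sketch of what the two pieces of configuration might look like, assuming the 0.20.205-era fair scheduler discussed above. With mapred.fairscheduler.poolnameproperty set to group.name, the pool name is taken from the submitting user's group, so per-group limits would go in a <pool> element named after that group. The allocation file format in the linked documentation describes <pool> and <user> elements; as far as that documentation goes, there is no separate <group> element. The group names hadoop-users and admin-users are hypothetical (borrowed from the pool names in the original question), the min/max values are the ones from that question, and the sketch assumes group.name is actually populated for submitted jobs on your Hadoop version.

```xml
<!-- mapred-site.xml (fragment): name each job's pool after the
     submitting user's group instead of the user name -->
<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <value>group.name</value>
</property>

<!-- fair-scheduler.xml (allocations file): one pool per group.
     Pool names must match the group names; both names below are
     hypothetical examples, not values confirmed in the thread. -->
<allocations>
  <pool name="hadoop-users">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>50</maxMaps>
    <maxReduces>50</maxReduces>
    <!-- per-pool cap on concurrently running jobs -->
    <maxRunningJobs>6</maxRunningJobs>
  </pool>
  <pool name="admin-users">
    <minMaps>20</minMaps>
    <minReduces>20</minReduces>
    <maxMaps>80</maxMaps>
    <maxReduces>80</maxReduces>
    <maxRunningJobs>6</maxRunningJobs>
  </pool>
  <!-- default per-user cap, still applied inside each pool -->
  <userMaxJobsDefault>3</userMaxJobsDefault>
</allocations>
```

Under these assumptions, user1 and user2 would be placed in the hadoop-users group and user3, user4, user5 in admin-users at the OS level, and their Pig jobs would land in the matching pool without any setting inside the scripts.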