Thanks everyone. Seems like I hit a dead end. It's kind of funny when I read that JIRA: run it 4 times and everything will work... where does that magic number come from? lol
Respects,

On Tue, Oct 16, 2012 at 4:12 PM, Arpit Gupta <[email protected]> wrote:
> https://issues.apache.org/jira/browse/MAPREDUCE-4398
>
> is the bug that Robin is referring to.
>
> --
> Arpit Gupta
> Hortonworks Inc.
> http://hortonworks.com/
>
> On Oct 16, 2012, at 3:51 PM, "Goldstone, Robin J." <[email protected]> wrote:
>
> This is similar to issues I ran into with permissions/ownership of
> mapred.system.dir when using the fair scheduler. We are instructed to set
> the ownership of mapred.system.dir to mapred:hadoop, and then when the job
> tracker starts up (running as user mapred) it explicitly sets the
> permissions on this directory to 700. Meanwhile, when I go to run a job as
> a regular user, it tries to write into mapred.system.dir but it
> can't, due to the ownership/permissions that have been established.
>
> Per discussion with Arpit Gupta, this is a bug with the fair scheduler, and
> it appears from your experience that there are similar issues with
> hadoop.tmp.dir. The whole idea of the fair scheduler is to run jobs under
> the user's identity rather than as user mapred. This is good from a
> security perspective, yet it seems no one bothered to account for this in
> terms of the permissions that need to be set in the various directories to
> enable it.
>
> Until this is sorted out by the Hadoop developers, I've put my attempts to
> use the fair scheduler on hold.
>
> Regards,
> Robin Goldstone, LLNL
>
> On 10/16/12 3:32 PM, "Patai Sangbutsarakum" <[email protected]> wrote:
>
> Hi Harsh,
> Thanks for breaking it down clearly. I would say I am 98% successful
> with the instructions.
> The 2% is about hadoop.tmp.dir.
>
> Let's say I have 2 users:
> userA is the user that starts HDFS and MapReduce
> userB is a regular user
>
> If I use the default value of hadoop.tmp.dir, /tmp/hadoop-${user.name},
> I can submit a job as userA but not as userB:
> user=userB, access=WRITE, inode="/tmp/hadoop-userA/mapred/staging"
> :userA:supergroup:drwxr-xr-x
>
> I googled around; someone recommended changing hadoop.tmp.dir to
> /tmp/hadoop. That way it almost works; the thing is,
> if I submit as userA, it will create /tmp/hadoop on the local machine,
> owned by userA:userA,
> and once I try to submit a job from the same machine as userB, I get
> "Error creating temp dir in hadoop.tmp.dir /tmp/hadoop due to
> Permission denied"
> (because /tmp/hadoop is owned by userA:userA). Vice versa, if I delete
> /tmp/hadoop and let the directory be created by userB, userA will not
> be able to submit jobs.
>
> Which is the right approach I should work with?
> Please suggest.
>
> Patai
>
> On Mon, Oct 15, 2012 at 3:18 PM, Harsh J <[email protected]> wrote:
> > Hi Patai,
> >
> > Reply inline.
> >
> > On Tue, Oct 16, 2012 at 2:57 AM, Patai Sangbutsarakum
> > <[email protected]> wrote:
> > > Thanks for the input,
> > > I am reading the document; I forgot to mention that I am on CDH3u4.
> >
> > That version should have the support for all of this.
> >
> > > > If you point your poolname property to mapred.job.queue.name, then you
> > > > can leverage the per-queue ACLs
> > >
> > > Does that mean that if I plan on 3 pools in the fair scheduler, I have to
> > > configure 3 queues of the capacity scheduler, in order to have each pool
> > > leverage the per-queue ACL of each queue?
> >
> > Queues are not hard-tied into CapacityScheduler. You can have generic
> > queues in MR, and FairScheduler can bind its Pool concept into the
> > Queue configuration.
> >
> > All you need to do is the following:
> >
> > 1. Map the FairScheduler pool name to reuse queue names itself:
> >
> > mapred.fairscheduler.poolnameproperty set to 'mapred.job.queue.name'
> >
> > 2. Define your required queues:
> >
> > mapred.job.queues set to "default,foo,bar", for example, for 3 queues:
> > default, foo and bar.
> >
> > 3. Define submit ACLs for each queue:
> >
> > mapred.queue.default.acl-submit-job set to "patai,foobar users,adm"
> > (usernames groupnames)
> >
> > mapred.queue.foo.acl-submit-job set to "spam eggs"
> >
> > Likewise for the remaining queues, as you need them.
> >
> > 4. Enable ACLs and restart the JT:
> >
> > mapred.acls.enabled set to "true"
> >
> > 5. Users then use the right API to set queue names before submitting
> > jobs, or use -Dmapred.job.queue.name=value via the CLI (if using Tool):
> >
> > http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapred/JobConf.html#setQueueName(java.lang.String)
> >
> > 6. Done.
> >
> > Let us know if this works!
> >
> > --
> > Harsh J
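For anyone following along, Harsh's steps 1-4 would look roughly like the sketch below in mapred-site.xml. This is for MR1/CDH3-era property names, and the queue names and ACL entries are just the examples from the thread, not recommendations; note that in stock MR1 the queue-list property is mapred.queue.names (step 2 above calls it mapred.job.queues), so check the defaults for your exact release.

```xml
<!-- Sketch of steps 1-4 above (mapred-site.xml, MR1 / CDH3-era names).
     Queue names, usernames, and group names are the thread's examples. -->

<!-- 1. FairScheduler derives its pool name from the job's queue name -->
<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <value>mapred.job.queue.name</value>
</property>

<!-- 2. Define the queues (mapred.queue.names in stock MR1) -->
<property>
  <name>mapred.queue.names</name>
  <value>default,foo,bar</value>
</property>

<!-- 3. Per-queue submit ACLs: "usernames groupnames" -->
<property>
  <name>mapred.queue.default.acl-submit-job</name>
  <value>patai,foobar users,adm</value>
</property>
<property>
  <name>mapred.queue.foo.acl-submit-job</name>
  <value>spam eggs</value>
</property>

<!-- 4. Enable ACL checking (restart the JobTracker afterwards) -->
<property>
  <name>mapred.acls.enabled</name>
  <value>true</value>
</property>
```

A job would then pick a queue either in code via JobConf.setQueueName("foo"), or on the command line (if the driver uses Tool/ToolRunner) with something like: hadoop jar myjob.jar MyDriver -Dmapred.job.queue.name=foo in out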
