I ran into the same issue Jason describes below (i.e. the hadoop daemon
is run as one user, but jobs are submitted by other users; please see
Jason's message for details). I found his email in the archive, but
nobody replied. Has this problem already been posted and answered and I
simply didn't find it, or is this usage pattern really so unusual? If
not, can somebody help out here? Thanks a lot,
Eric
jason gessner wrote:
hi all.
We have a small cluster set up now and have hit an interesting issue
when trying to get people running jobs on it.
My user account started the cluster. Other users with a copy of the
same conf directory can successfully put data on the cluster, and the
new files/folders land under a path containing their username and are
owned by them. When a job is submitted, however, hadoop attempts to
write the job files into my directory and fails because of permissions.
An example: users jason and allister, where jason started the cluster.
When jason puts data from his account, it goes to /user/hadoop-jason/;
when allister puts data from his account, it goes to
/user/hadoop-allister/. Both pieces of data have the expected
permissions.
When a job runs, though, the job*.xml file goes to /tmp/hadoop-jason
regardless of which account kicked the job off. /tmp/hadoop-allister
does exist and has the proper ownership (allister), but hadoop still
tries to write allister's job.xml file to /tmp/hadoop-jason.
I hope the example is clear enough.
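For reference, I suspect this comes from the defaults: hadoop.tmp.dir
defaults to /tmp/hadoop-${user.name}, and mapred.system.dir defaults to
${hadoop.tmp.dir}/mapred/system, so the JobTracker resolves
${user.name} against the daemon's account and hands every submitting
client the same system directory. If that's right, one workaround
sketch (untested here; the /hadoop/mapred/system value is just an
illustrative path, not a required one) is to pin the system directory
to a user-independent location in hadoop-site.xml:

  <!-- hadoop-site.xml: replace the per-user default
       ${hadoop.tmp.dir}/mapred/system with a fixed DFS path
       that all submitting users can write to -->
  <property>
    <name>mapred.system.dir</name>
    <value>/hadoop/mapred/system</value>
  </property>

The daemon owner would still need to open up permissions on that
directory so other accounts can submit jobs to it.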
My questions are:
1) Should we simply set appropriately permissive (sticky-style)
permissions on the root of the hadoop paths? (A sketch of what I mean
follows below.)
2) Should we run all jobs from a single account?
How have people dealt with this in their environments?
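On question 1, a minimal sketch of what I mean, assuming a Hadoop
release with the permissions feature (0.16 or later) and that opening
the shared roots is acceptable in your environment; note that older
HDFS releases may not honor the sticky bit itself, so plain 777 is
shown:

  # run from the account that started the daemons (the DFS superuser)
  # open the daemon owner's tmp dir so other users' job.xml files
  # can be written there
  hadoop fs -chmod 777 /tmp/hadoop-jason
  # or, more broadly, open the shared roots themselves
  hadoop fs -chmod 777 /tmp
  hadoop fs -chmod 777 /user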
Thanks!
-jason