[ https://issues.apache.org/jira/browse/HADOOP-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651568#action_12651568 ]

Hemanth Yamijala commented on HADOOP-4490:
------------------------------------------

I had some offline discussions with Arun and Sameer, and here are some initial 
thoughts on the approach. A lot of details still need to be fleshed out, but I am 
posting this to get some early feedback.

We do want to run the daemons as non-privileged users, and yet go with a setuid 
based approach to run tasks as a regular user. One approach that was proposed 
to do this is as follows:
- We create a setuid executable, say a taskcontroller, that will be owned by 
root.
- This executable can take the following arguments - <user> <command> <command 
arguments>.
- <user> will be the job owner.
- <command> will be an action that needs to be performed, such as LAUNCH_JVM, 
KILL_TASK, etc.
- <command arguments> will depend on the command. For e.g. LAUNCH_JVM would 
have the arguments currently used to launch a JVM via the ShellCommandExecutor.
- The tasktracker will launch this executable with the appropriate command and 
arguments when needed.
- As the executable is setuid root, it will start running as root and will 
immediately drop privileges via setuid to run as the user.
- Then the arguments will be used to execute the required action, for e.g. 
launching a JVM or killing a task.
- Before dropping privileges, if needed, the executable could set up 
directories with appropriate ownership, etc.
- Naturally this would be platform specific. Hence, we can define a 
TaskController class that defines APIs to encapsulate these actions. For e.g., 
something like:
{code}
abstract class TaskController {

  abstract void launchTask();
  abstract void killTask(Task t);
  // etc...
}
{code}
- This could be extended by a LinuxTaskController that converts the generic 
arguments into something that can be passed to the executable - for e.g. maybe 
a process ID.
- One specific point is about directory / file permissions. Sameer was of 
the opinion that the permissions should be quite strict, that is, world 
readable rights should not be allowed. There are cases where both the task and 
the daemon may need to access files. To handle this, one suggestion is to first 
set the ownership to the user while the task runs, and then change the 
ownership back to the daemon after the task is done.
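To illustrate the launch path described above, here is a rough sketch of how a 
LinuxTaskController might assemble the argv for the setuid executable (the 
class, method names, and executable path are mine for illustration - they are 
not existing code; the real implementation would presumably hand the resulting 
argv to ShellCommandExecutor):
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only: converts a generic (user, command, args) request
// into the command line "taskcontroller <user> <command> <command arguments>".
class LinuxTaskController {

  // Illustrative path; in practice this would be configurable.
  static final String TASK_CONTROLLER_EXE = "/usr/local/bin/taskcontroller";

  enum TaskCommand { LAUNCH_JVM, KILL_TASK }

  // Build: taskcontroller <user> <command> <command arguments>
  static String[] buildCommandLine(String user, TaskCommand cmd,
                                   String... cmdArgs) {
    List<String> argv = new ArrayList<String>();
    argv.add(TASK_CONTROLLER_EXE);
    argv.add(user);
    argv.add(cmd.name());          // e.g. LAUNCH_JVM or KILL_TASK
    argv.addAll(Arrays.asList(cmdArgs));
    return argv.toArray(new String[0]);
  }
}
{code}
For example, killing task with pid 1234 owned by user alice would produce 
{{taskcontroller alice KILL_TASK 1234}}; the setuid executable would then drop 
privileges to alice before signalling the process.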

The points above specify a broad approach. Please comment on whether this seems 
reasonable, reasonable in parts, or completely off the mark. *smile*. Based on 
feedback, I will start implementing a prototype to flesh out the details.

> Map and Reduce tasks should run as the user who submitted the job
> -----------------------------------------------------------------
>
>                 Key: HADOOP-4490
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4490
>             Project: Hadoop Core
>          Issue Type: Sub-task
>          Components: mapred, security
>            Reporter: Arun C Murthy
>            Assignee: Hemanth Yamijala
>             Fix For: 0.20.0
>
>
> Currently the TaskTracker spawns the map/reduce tasks, resulting in them 
> running as the user who started the TaskTracker.
> For security and accounting purposes the tasks should be run as the job-owner.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
