[jira] [Commented] (YARN-5673) [Umbrella] Re-write container-executor to improve security, extensibility, and portability

2017-11-17 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16257394#comment-16257394
 ] 

Miklos Szegedi commented on YARN-5673:
--

[~eyang], thank you for the explanation. Indeed, these might need root 
privileges.


> [Umbrella] Re-write container-executor to improve security, extensibility, 
> and portability
> --
>
> Key: YARN-5673
> URL: https://issues.apache.org/jira/browse/YARN-5673
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: container-executor Re-write Design Document.pdf
>
>
> As YARN adds support for new features that require administrator 
> privileges(such as support for network throttling and docker), we’ve had to 
> add new capabilities to the container-executor. This has led to a recognition 
> that the current container-executor security features as well as the code 
> could be improved. The current code is fragile and it’s hard to add new 
> features without causing regressions. Some of the improvements that need to 
> be made are -
> *Security*
> Currently the container-executor has limited security features. It relies 
> primarily on the permissions set on the binary but does little additional 
> security beyond that. There are few outstanding issues today -
> - No audit log
> - No way to disable features - network throttling and docker support are 
> built in and there’s no way to turn them off at a container-executor level
> - Code can be improved - a lot of the code switches users back and forth in 
> an arbitrary manner
> - No input validation - the paths, and files provided at invocation are not 
> validated or required to be in some specific location
> - No signing functionality - there is no way to enforce that the binary was 
> invoked by the NM and not by any other process
> *Code Issues*
> The code layout and implementation themselves can be improved. Some issues 
> there are -
> - No support for log levels - everything is logged and this can’t be turned 
> on or off
> - Extremely long set of invocation parameters(specifically during container 
> launch) which makes turning features on or off complicated
> - Poor test coverage - it’s easy to introduce regressions today due to the 
> lack of a proper test setup
> - Duplicate functionality - there is some amount of code duplication
> - Hard to make improvements or add new features due to the issues raised above
> *Portability*
>  - The container-executor mixes platform dependent APIs with platform 
> independent APIs making it hard to run it on multiple platforms. Allowing it 
> to run on multiple platforms also improves the overall code structure .
> One option is to improve the existing container-executor, however it might be 
> easier to start from scratch. That allows existing functionality to be 
> supported until we are ready to switch to the new code.
> This umbrella JIRA is to capture all the work required for the new code. I'm 
> going to work on a design doc for the changes - any suggestions or 
> improvements are welcome.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5673) [Umbrella] Re-write container-executor to improve security, extensibility, and portability

2017-11-17 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16257315#comment-16257315
 ] 

Eric Yang commented on YARN-5673:
-

Hi [~miklos.szeg...@cloudera.com] 
4) For clean up container, it *might* require root privileges.  For example, if 
a docker container is stuck, and we need to clean up forcefully using {{docker 
rm-f}} as root user or a container runs multi-user code and generate local temp 
files that can only be cleaned up using privileged user access.  This might be 
the preferred method for clean up.

5) Some Linux (i.e. SuSE) prevents user to identify which PID is attached to 
which port unless the checking is using root privileges.  It is also possible a 
PID file is written and readable by root user only.  This is the reason that we 
*might* want to give root access to check container process tree and ports 
allocation.

{quote}
I absolutely agree. However, the tool already does this, does not it?
{quote}

Agree, rewrite is not required.  Enhancement and bug fix might be good enough 
to make this better.

> [Umbrella] Re-write container-executor to improve security, extensibility, 
> and portability
> --
>
> Key: YARN-5673
> URL: https://issues.apache.org/jira/browse/YARN-5673
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: container-executor Re-write Design Document.pdf
>
>
> As YARN adds support for new features that require administrator 
> privileges(such as support for network throttling and docker), we’ve had to 
> add new capabilities to the container-executor. This has led to a recognition 
> that the current container-executor security features as well as the code 
> could be improved. The current code is fragile and it’s hard to add new 
> features without causing regressions. Some of the improvements that need to 
> be made are -
> *Security*
> Currently the container-executor has limited security features. It relies 
> primarily on the permissions set on the binary but does little additional 
> security beyond that. There are few outstanding issues today -
> - No audit log
> - No way to disable features - network throttling and docker support are 
> built in and there’s no way to turn them off at a container-executor level
> - Code can be improved - a lot of the code switches users back and forth in 
> an arbitrary manner
> - No input validation - the paths, and files provided at invocation are not 
> validated or required to be in some specific location
> - No signing functionality - there is no way to enforce that the binary was 
> invoked by the NM and not by any other process
> *Code Issues*
> The code layout and implementation themselves can be improved. Some issues 
> there are -
> - No support for log levels - everything is logged and this can’t be turned 
> on or off
> - Extremely long set of invocation parameters(specifically during container 
> launch) which makes turning features on or off complicated
> - Poor test coverage - it’s easy to introduce regressions today due to the 
> lack of a proper test setup
> - Duplicate functionality - there is some amount of code duplication
> - Hard to make improvements or add new features due to the issues raised above
> *Portability*
>  - The container-executor mixes platform dependent APIs with platform 
> independent APIs making it hard to run it on multiple platforms. Allowing it 
> to run on multiple platforms also improves the overall code structure .
> One option is to improve the existing container-executor, however it might be 
> easier to start from scratch. That allows existing functionality to be 
> supported until we are ready to switch to the new code.
> This umbrella JIRA is to capture all the work required for the new code. I'm 
> going to work on a design doc for the changes - any suggestions or 
> improvements are welcome.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5673) [Umbrella] Re-write container-executor to improve security, extensibility, and portability

2017-11-17 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16257268#comment-16257268
 ] 

Miklos Szegedi commented on YARN-5673:
--

[~eyang], Thank you for your comment. I have a few questions.
Could you elaborate, why does clean up container (4.) require to be written in 
C? Similarly I do not think step 5. mentioned above requires it either. I agree 
that only steps that need root privileges or system calls need to be here 
everything else could go to node manager or a Java process run as root. I even 
have some concerns, that the docker code needs to reside in container executor. 
See YARN-7506 for reference, to discuss.
bq. We should try to contain root power by minimize the workflow and type that 
we support.
I absolutely agree. However, the tool already does this, does not it?

> [Umbrella] Re-write container-executor to improve security, extensibility, 
> and portability
> --
>
> Key: YARN-5673
> URL: https://issues.apache.org/jira/browse/YARN-5673
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: container-executor Re-write Design Document.pdf
>
>
> As YARN adds support for new features that require administrator 
> privileges(such as support for network throttling and docker), we’ve had to 
> add new capabilities to the container-executor. This has led to a recognition 
> that the current container-executor security features as well as the code 
> could be improved. The current code is fragile and it’s hard to add new 
> features without causing regressions. Some of the improvements that need to 
> be made are -
> *Security*
> Currently the container-executor has limited security features. It relies 
> primarily on the permissions set on the binary but does little additional 
> security beyond that. There are few outstanding issues today -
> - No audit log
> - No way to disable features - network throttling and docker support are 
> built in and there’s no way to turn them off at a container-executor level
> - Code can be improved - a lot of the code switches users back and forth in 
> an arbitrary manner
> - No input validation - the paths, and files provided at invocation are not 
> validated or required to be in some specific location
> - No signing functionality - there is no way to enforce that the binary was 
> invoked by the NM and not by any other process
> *Code Issues*
> The code layout and implementation themselves can be improved. Some issues 
> there are -
> - No support for log levels - everything is logged and this can’t be turned 
> on or off
> - Extremely long set of invocation parameters(specifically during container 
> launch) which makes turning features on or off complicated
> - Poor test coverage - it’s easy to introduce regressions today due to the 
> lack of a proper test setup
> - Duplicate functionality - there is some amount of code duplication
> - Hard to make improvements or add new features due to the issues raised above
> *Portability*
>  - The container-executor mixes platform dependent APIs with platform 
> independent APIs making it hard to run it on multiple platforms. Allowing it 
> to run on multiple platforms also improves the overall code structure .
> One option is to improve the existing container-executor, however it might be 
> easier to start from scratch. That allows existing functionality to be 
> supported until we are ready to switch to the new code.
> This umbrella JIRA is to capture all the work required for the new code. I'm 
> going to work on a design doc for the changes - any suggestions or 
> improvements are welcome.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5673) [Umbrella] Re-write container-executor to improve security, extensibility, and portability

2017-11-16 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16256151#comment-16256151
 ] 

Eric Yang commented on YARN-5673:
-

It would be safer for container-executor to follow specific workflow, such as:

# validate input
# initialize container
# run code in container
# clean up container
# check container

Each workflow can support different type of containers, such as docker, yarn 
container, etc.  Each type is hard wired to specific Linux cli operations.  
This safe guard ourselves from abusing root power.  Systemd is a great example 
of getting too much root power.  We should try to contain root power by 
minimize the workflow and type that we support.

> [Umbrella] Re-write container-executor to improve security, extensibility, 
> and portability
> --
>
> Key: YARN-5673
> URL: https://issues.apache.org/jira/browse/YARN-5673
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: container-executor Re-write Design Document.pdf
>
>
> As YARN adds support for new features that require administrator 
> privileges(such as support for network throttling and docker), we’ve had to 
> add new capabilities to the container-executor. This has led to a recognition 
> that the current container-executor security features as well as the code 
> could be improved. The current code is fragile and it’s hard to add new 
> features without causing regressions. Some of the improvements that need to 
> be made are -
> *Security*
> Currently the container-executor has limited security features. It relies 
> primarily on the permissions set on the binary but does little additional 
> security beyond that. There are few outstanding issues today -
> - No audit log
> - No way to disable features - network throttling and docker support are 
> built in and there’s no way to turn them off at a container-executor level
> - Code can be improved - a lot of the code switches users back and forth in 
> an arbitrary manner
> - No input validation - the paths, and files provided at invocation are not 
> validated or required to be in some specific location
> - No signing functionality - there is no way to enforce that the binary was 
> invoked by the NM and not by any other process
> *Code Issues*
> The code layout and implementation themselves can be improved. Some issues 
> there are -
> - No support for log levels - everything is logged and this can’t be turned 
> on or off
> - Extremely long set of invocation parameters(specifically during container 
> launch) which makes turning features on or off complicated
> - Poor test coverage - it’s easy to introduce regressions today due to the 
> lack of a proper test setup
> - Duplicate functionality - there is some amount of code duplication
> - Hard to make improvements or add new features due to the issues raised above
> *Portability*
>  - The container-executor mixes platform dependent APIs with platform 
> independent APIs making it hard to run it on multiple platforms. Allowing it 
> to run on multiple platforms also improves the overall code structure .
> One option is to improve the existing container-executor, however it might be 
> easier to start from scratch. That allows existing functionality to be 
> supported until we are ready to switch to the new code.
> This umbrella JIRA is to capture all the work required for the new code. I'm 
> going to work on a design doc for the changes - any suggestions or 
> improvements are welcome.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5673) [Umbrella] Re-write container-executor to improve security, extensibility, and portability

2016-12-07 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729756#comment-15729756
 ] 

Allen Wittenauer commented on YARN-5673:


FWIW, a lot of the issues raised here are why I recommended moving to dynamic 
loading.  No extra executables, may potentially cut down on launch time, no 
need to worry about unused features being holes because you can remove the code 
from the execution path completely, etc, etc.

> [Umbrella] Re-write container-executor to improve security, extensibility, 
> and portability
> --
>
> Key: YARN-5673
> URL: https://issues.apache.org/jira/browse/YARN-5673
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: container-executor Re-write Design Document.pdf
>
>
> As YARN adds support for new features that require administrator 
> privileges(such as support for network throttling and docker), we’ve had to 
> add new capabilities to the container-executor. This has led to a recognition 
> that the current container-executor security features as well as the code 
> could be improved. The current code is fragile and it’s hard to add new 
> features without causing regressions. Some of the improvements that need to 
> be made are -
> *Security*
> Currently the container-executor has limited security features. It relies 
> primarily on the permissions set on the binary but does little additional 
> security beyond that. There are few outstanding issues today -
> - No audit log
> - No way to disable features - network throttling and docker support are 
> built in and there’s no way to turn them off at a container-executor level
> - Code can be improved - a lot of the code switches users back and forth in 
> an arbitrary manner
> - No input validation - the paths, and files provided at invocation are not 
> validated or required to be in some specific location
> - No signing functionality - there is no way to enforce that the binary was 
> invoked by the NM and not by any other process
> *Code Issues*
> The code layout and implementation themselves can be improved. Some issues 
> there are -
> - No support for log levels - everything is logged and this can’t be turned 
> on or off
> - Extremely long set of invocation parameters(specifically during container 
> launch) which makes turning features on or off complicated
> - Poor test coverage - it’s easy to introduce regressions today due to the 
> lack of a proper test setup
> - Duplicate functionality - there is some amount of code duplication
> - Hard to make improvements or add new features due to the issues raised above
> *Portability*
>  - The container-executor mixes platform dependent APIs with platform 
> independent APIs making it hard to run it on multiple platforms. Allowing it 
> to run on multiple platforms also improves the overall code structure .
> One option is to improve the existing container-executor, however it might be 
> easier to start from scratch. That allows existing functionality to be 
> supported until we are ready to switch to the new code.
> This umbrella JIRA is to capture all the work required for the new code. I'm 
> going to work on a design doc for the changes - any suggestions or 
> improvements are welcome.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5673) [Umbrella] Re-write container-executor to improve security, extensibility, and portability

2016-12-07 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729736#comment-15729736
 ] 

Miklos Szegedi commented on YARN-5673:
--

Thank you [~vvasudev] for the quick and detailed response! I really appreciate 
it.
{quote}
All of these binaries will require the setuid bit to be a set which means 
administrators will have to set permissions and manage 4 binaries. We also have 
to worry about 4 binaries that can have privilege escalation as opposed to one 
- any hot fixes for example will require all 4 binaries to be updated as 
opposed to just one. Interestingly you feel that administrator overhead of 
managing 4 binaries is worth it whereas some folks would prefer it the other 
way round . Do other folks feel that the multiple binaries approach is the way 
to go?
{quote}
Yes, I absolutely agree that this is a preference question. I am not sure about 
the ratios though. In terms of overhead, the administrator has to enable the 
modules in the configuration anyways. What I thought is that it is easier to 
set the permissions using familiar Unix tools rather than looking up the 
configuration files, reading the documentation about them and enabling the 
required modules with the right format. I have seen in the past issues with too 
many spaces for example.
However, please take some time to answer this one question. Let's assume, we 
were about to design /usr/bin/at, /usr/bin/sudo and /user/bin/passwd. They have 
about the same difference as container launching and mounting cgroups. Would 
you design them as separate tools, or as a single binary that loads them 
separately as modules based on a configuration file and command line options?
{quote}
We also have to worry about 4 binaries that can have privilege escalation as 
opposed to one
{quote}
I think the risk of privilege escalation is proportional to the amount of code 
rather than the amount of binaries, so it is about the same. On the other hand 
packing multiple functions into the same memory space may increase the sum of 
the individual risks in case of native code.
{quote}
any hot fixes for example will require all 4 binaries to be updated as opposed 
to just one. 
{quote}
(I assume that the code will be super stable :-) ...)
This was actually one question that I raised. What is the common code among 
features separated into modules? Only if common functionality is broken, it 
needs to be patched. I think this would be limited to auditing, logging, and 
maybe some filesystem operations that can be linked to the tools.
{quote}
Fair point. The idea here is that -
(1) Administrators will not add arbitrary modules to the module list.
(2) The posix-container-executor will give up all privileges before loading the 
modules which don't require administrator privileges
(3) Give administrators an option to turn off modules that require 
administrator privileges.
Would these help mitigate your concerns? The issue with the current setup is 
that there is no clean way to enable/disable functionality that administrators 
do not want enabled on their cluster.
{quote}
I agree, this is an issue in the current setup and yes, I think these are the 
right design decisions. Just as a side note, I prefer privileged modules be 
disabled by default for security and supportability reasons.
{quote}
Do you have some scenarios where container launch time has been an issue? The 
security aspects of a long running process versus one which is invoked on 
demand are different as well.
{quote}
I just wanted to discuss this design option early before much coding has 
started. If we want to use Yarn not just for long batch processing but for lots 
of quick requests in the future, launch time is an issue. I thought I raise 
pipe as an other option communicating the commands together with command line, 
file, and environment variables.

> [Umbrella] Re-write container-executor to improve security, extensibility, 
> and portability
> --
>
> Key: YARN-5673
> URL: https://issues.apache.org/jira/browse/YARN-5673
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: container-executor Re-write Design Document.pdf
>
>
> As YARN adds support for new features that require administrator 
> privileges(such as support for network throttling and docker), we’ve had to 
> add new capabilities to the container-executor. This has led to a recognition 
> that the current container-executor security features as well as the code 
> could be improved. The current code is fragile and it’s hard to add new 
> features without causing regressions. Some of the improvements that need to 
> be made are -
> *Security*
> Currently the 

[jira] [Commented] (YARN-5673) [Umbrella] Re-write container-executor to improve security, extensibility, and portability

2016-12-06 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727953#comment-15727953
 ] 

Varun Vasudev commented on YARN-5673:
-

Thanks for the feedback [~miklos.szeg...@cloudera.com]!

{quote}
What is the common functionality that all modules need and why do not we simply 
split them into separate executables instead of loading them as modules to a 
core container-executor process?
linux-container-executor
docker-container-executor
mount-cgroups
tc-executor

2. I also have a concern of maintenance. File system privileges setup is what 
system administrators are most familiar with. We could let them use that 
instead of modifying proprietary configuration files, how to locate load 
modules into a linux-container-executor binary. Any common functionality like 
auditing, logging, setup checking can be statically linked to each executable. 
It also has the advantage of setting suid bits separately. An example: 
/usr/bin/sudo and /usr/bin/passwd tell in their names what they do and what you 
get if you set suid permission on them. An administrator would only set it on 
the executables that are needed. Setting suid on the posix-container-executor 
on the other hand means that it is allowed to load modules in a controlled but 
advanced way. Separate executables made the life of administrators way easier I 
think. Each of these binaries can have its own configuration file just like the 
modules you proposed.
{quote}

All of these binaries will require the setuid bit to be a set which means 
administrators will have to set permissions and manage 4 binaries. We also have 
to worry about 4 binaries that can have privilege escalation as opposed to one 
- any hot fixes for example will require all 4 binaries to be updated as 
opposed to just one. Interestingly you feel that administrator overhead of 
managing 4 binaries is worth it whereas some folks would prefer it the other 
way round :). Do other folks feel that the multiple binaries approach is the 
way to go?

{quote}
1. I have a concern of native modules potentially loaded into the same process. 
Even if communication between modules is not allowed this is a native binary, 
where all native code will have access to the whole memory. Just like the 
current executor, when more features are running in the same process with admin 
privileges, a faulty or malicious module may cause security issues even loading 
and accessing others. Even if we can add protection, the protection code would 
add more complexity.
{quote}

Fair point. The idea here is that -
(1) Administrators will not add arbitrary modules to the module list.
(2) The posix-container-executor will give up all privileges before loading the 
modules which don't require administrator privileges
(3) Give administrators an option to turn off modules that require 
administrator privileges.
Would these help mitigate your concerns? The issue with the current setup is 
that there is no clean way to enable/disable functionality that administrators 
do not want enabled on their cluster.

{quote}
A. One more separate issue that I wanted to ask your opinion about is the time 
launching a container. First the container executor is executed, then a shell 
script that runs the actual container like Java. Would not it be faster to 
launch container executor just once and communicate launch commands through the 
standard pipe or a named pipe and keep it running as long as the node manager 
is running?
{quote}
It probably would be but container launch time hasn’t been something people 
have complained about. Do you have some scenarios where container launch time 
has been an issue? The security aspects of a long running process versus one 
which is invoked on demand are different as well.

> [Umbrella] Re-write container-executor to improve security, extensibility, 
> and portability
> --
>
> Key: YARN-5673
> URL: https://issues.apache.org/jira/browse/YARN-5673
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: container-executor Re-write Design Document.pdf
>
>
> As YARN adds support for new features that require administrator 
> privileges(such as support for network throttling and docker), we’ve had to 
> add new capabilities to the container-executor. This has led to a recognition 
> that the current container-executor security features as well as the code 
> could be improved. The current code is fragile and it’s hard to add new 
> features without causing regressions. Some of the improvements that need to 
> be made are -
> *Security*
> Currently the container-executor has limited security features. It relies 
> primarily on the permissions set on 

[jira] [Commented] (YARN-5673) [Umbrella] Re-write container-executor to improve security, extensibility, and portability

2016-12-06 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726489#comment-15726489
 ] 

Miklos Szegedi commented on YARN-5673:
--

Thank you [~vvasudev] for the design proposal! I like the design approach.

I was missing some arguments about another approach, and why it was not 
considered.
What is the common functionality that all modules need and why do not we simply 
split them into separate executables instead of loading them as modules to a 
core container-executor process?
linux-container-executor
docker-container-executor
mount-cgroups
tc-executor
...

1. I have a concern of native modules potentially loaded into the same process. 
Even if communication between modules is not allowed this is a native binary, 
where all native code will have access to the whole memory. Just like the 
current executor, when more features are running in the same process with admin 
privileges, a faulty or malicious module may cause security issues even loading 
and accessing others. Even if we can add protection, the protection code would 
add more complexity.

2. I also have a concern of maintenance. File system privileges setup is what 
system administrators are most familiar with. We could let them use that 
instead of modifying proprietary configuration files, how to locate load 
modules into a linux-container-executor binary. Any common functionality like 
auditing, logging, setup checking can be statically linked to each executable. 
It also has the advantage of setting suid bits separately. An example: 
/usr/bin/sudo and /usr/bin/passwd tell in their names what they do and what you 
get if you set suid permission on them. An administrator would only set it on 
the executables that are needed. Setting suid on the posix-container-executor 
on the other hand means that it is allowed to load modules in a controlled but 
advanced way.  Separate executables made the life of administrators way easier 
I think. Each of these binaries can have its own configuration file just like 
the modules you proposed.

3. Separate executables may help a little with debugging, too.



A. One more separate issue that I wanted to ask your opinion about is the time 
launching a container. First the container executor is executed, then a shell 
script that runs the actual container like Java. Would not it be faster to 
launch container executor just once and communicate launch commands through the 
standard pipe or a named pipe and keep it running as long as the node manager 
is running?

> [Umbrella] Re-write container-executor to improve security, extensibility, 
> and portability
> --
>
> Key: YARN-5673
> URL: https://issues.apache.org/jira/browse/YARN-5673
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: container-executor Re-write Design Document.pdf
>
>
> As YARN adds support for new features that require administrator 
> privileges(such as support for network throttling and docker), we’ve had to 
> add new capabilities to the container-executor. This has led to a recognition 
> that the current container-executor security features as well as the code 
> could be improved. The current code is fragile and it’s hard to add new 
> features without causing regressions. Some of the improvements that need to 
> be made are -
> *Security*
> Currently the container-executor has limited security features. It relies 
> primarily on the permissions set on the binary but does little additional 
> security beyond that. There are few outstanding issues today -
> - No audit log
> - No way to disable features - network throttling and docker support are 
> built in and there’s no way to turn them off at a container-executor level
> - Code can be improved - a lot of the code switches users back and forth in 
> an arbitrary manner
> - No input validation - the paths, and files provided at invocation are not 
> validated or required to be in some specific location
> - No signing functionality - there is no way to enforce that the binary was 
> invoked by the NM and not by any other process
> *Code Issues*
> The code layout and implementation themselves can be improved. Some issues 
> there are -
> - No support for log levels - everything is logged and this can’t be turned 
> on or off
> - Extremely long set of invocation parameters(specifically during container 
> launch) which makes turning features on or off complicated
> - Poor test coverage - it’s easy to introduce regressions today due to the 
> lack of a proper test setup
> - Duplicate functionality - there is some amount of code duplication
> - Hard to make improvements or add new features due to the issues raised above
> 

[jira] [Commented] (YARN-5673) [Umbrella] Re-write container-executor to improve security, extensibility, and portability

2016-12-05 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724661#comment-15724661
 ] 

Varun Vasudev commented on YARN-5673:
-

I've created branch YARN-5673 for this work.

> [Umbrella] Re-write container-executor to improve security, extensibility, 
> and portability
> --
>
> Key: YARN-5673
> URL: https://issues.apache.org/jira/browse/YARN-5673
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: container-executor Re-write Design Document.pdf
>
>
> As YARN adds support for new features that require administrator 
> privileges(such as support for network throttling and docker), we’ve had to 
> add new capabilities to the container-executor. This has led to a recognition 
> that the current container-executor security features as well as the code 
> could be improved. The current code is fragile and it’s hard to add new 
> features without causing regressions. Some of the improvements that need to 
> be made are -
> *Security*
> Currently the container-executor has limited security features. It relies 
> primarily on the permissions set on the binary but does little additional 
> security beyond that. There are few outstanding issues today -
> - No audit log
> - No way to disable features - network throttling and docker support are 
> built in and there’s no way to turn them off at a container-executor level
> - Code can be improved - a lot of the code switches users back and forth in 
> an arbitrary manner
> - No input validation - the paths, and files provided at invocation are not 
> validated or required to be in some specific location
> - No signing functionality - there is no way to enforce that the binary was 
> invoked by the NM and not by any other process
> *Code Issues*
> The code layout and implementation themselves can be improved. Some issues 
> there are -
> - No support for log levels - everything is logged and this can’t be turned 
> on or off
> - Extremely long set of invocation parameters(specifically during container 
> launch) which makes turning features on or off complicated
> - Poor test coverage - it’s easy to introduce regressions today due to the 
> lack of a proper test setup
> - Duplicate functionality - there is some amount of code duplication
> - Hard to make improvements or add new features due to the issues raised above
> *Portability*
>  - The container-executor mixes platform dependent APIs with platform 
> independent APIs making it hard to run it on multiple platforms. Allowing it 
> to run on multiple platforms also improves the overall code structure .
> One option is to improve the existing container-executor, however it might be 
> easier to start from scratch. That allows existing functionality to be 
> supported until we are ready to switch to the new code.
> This umbrella JIRA is to capture all the work required for the new code. I'm 
> going to work on a design doc for the changes - any suggestions or 
> improvements are welcome.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5673) [Umbrella] Re-write container-executor to improve security, extensibility, and portability

2016-10-13 Thread Sidharta Seethana (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572625#comment-15572625
 ] 

Sidharta Seethana commented on YARN-5673:
-

[~templedf], I think this is restricted to the container-executor binary, 
specifically, not ContainerExecutor. You are right, though - about additional 
functionality being pushed into lower layers and the use ENV variables. This 
former was to ensure backward compatibility at the ContainerExecutor interface 
layer. The latter was to ensure a) no changes to protocols b) allow for 
existing apps to use the new functionality without changes (e.g MR, Spark). We 
should discuss these in a new/different JIRA, though. 

> [Umbrella] Re-write container-executor to improve security, extensibility, 
> and portability
> --
>
> Key: YARN-5673
> URL: https://issues.apache.org/jira/browse/YARN-5673
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: container-executor Re-write Design Document.pdf
>
>
> As YARN adds support for new features that require administrator 
> privileges(such as support for network throttling and docker), we’ve had to 
> add new capabilities to the container-executor. This has led to a recognition 
> that the current container-executor security features as well as the code 
> could be improved. The current code is fragile and it’s hard to add new 
> features without causing regressions. Some of the improvements that need to 
> be made are -
> *Security*
> Currently the container-executor has limited security features. It relies 
> primarily on the permissions set on the binary but does little additional 
> security beyond that. There are few outstanding issues today -
> - No audit log
> - No way to disable features - network throttling and docker support are 
> built in and there’s no way to turn them off at a container-executor level
> - Code can be improved - a lot of the code switches users back and forth in 
> an arbitrary manner
> - No input validation - the paths, and files provided at invocation are not 
> validated or required to be in some specific location
> - No signing functionality - there is no way to enforce that the binary was 
> invoked by the NM and not by any other process
> *Code Issues*
> The code layout and implementation themselves can be improved. Some issues 
> there are -
> - No support for log levels - everything is logged and this can’t be turned 
> on or off
> - Extremely long set of invocation parameters(specifically during container 
> launch) which makes turning features on or off complicated
> - Poor test coverage - it’s easy to introduce regressions today due to the 
> lack of a proper test setup
> - Duplicate functionality - there is some amount of code duplication
> - Hard to make improvements or add new features due to the issues raised above
> *Portability*
>  - The container-executor mixes platform dependent APIs with platform 
> independent APIs making it hard to run it on multiple platforms. Allowing it 
> to run on multiple platforms also improves the overall code structure .
> One option is to improve the existing container-executor, however it might be 
> easier to start from scratch. That allows existing functionality to be 
> supported until we are ready to switch to the new code.
> This umbrella JIRA is to capture all the work required for the new code. I'm 
> going to work on a design doc for the changes - any suggestions or 
> improvements are welcome.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5673) [Umbrella] Re-write container-executor to improve security, extensibility, and portability

2016-09-26 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15523561#comment-15523561
 ] 

Daniel Templeton commented on YARN-5673:


+1

The current state of things has pushed the majority of the functionality under 
{{LinuxContainerExecutor}}, leaving the {{ContainerExecutor}} interface rather 
lame.  I'm also unhappy with the notion of passing all of the Docker 
information as environment variables.

> [Umbrella] Re-write container-executor to improve security, extensibility, 
> and portability
> --
>
> Key: YARN-5673
> URL: https://issues.apache.org/jira/browse/YARN-5673
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
>
> As YARN adds support for new features that require administrator 
> privileges(such as support for network throttling and docker), we’ve had to 
> add new capabilities to the container-executor. This has led to a recognition 
> that the current container-executor security features as well as the code 
> could be improved. The current code is fragile and it’s hard to add new 
> features without causing regressions. Some of the improvements that need to 
> be made are -
> *Security*
> Currently the container-executor has limited security features. It relies 
> primarily on the permissions set on the binary but does little additional 
> security beyond that. There are few outstanding issues today -
> - No audit log
> - No way to disable features - network throttling and docker support are 
> built in and there’s no way to turn them off at a container-executor level
> - Code can be improved - a lot of the code switches users back and forth in 
> an arbitrary manner
> - No input validation - the paths, and files provided at invocation are not 
> validated or required to be in some specific location
> - No signing functionality - there is no way to enforce that the binary was 
> invoked by the NM and not by any other process
> *Code Issues*
> The code layout and implementation themselves can be improved. Some issues 
> there are -
> - No support for log levels - everything is logged and this can’t be turned 
> on or off
> - Extremely long set of invocation parameters(specifically during container 
> launch) which makes turning features on or off complicated
> - Poor test coverage - it’s easy to introduce regressions today due to the 
> lack of a proper test setup
> - Duplicate functionality - there is some amount of code duplication
> - Hard to make improvements or add new features due to the issues raised above
> *Portability*
>  - The container-executor mixes platform dependent APIs with platform 
> independent APIs making it hard to run it on multiple platforms. Allowing it 
> to run on multiple platforms also improves the overall code structure .
> One option is to improve the existing container-executor, however it might be 
> easier to start from scratch. That allows existing functionality to be 
> supported until we are ready to switch to the new code.
> This umbrella JIRA is to capture all the work required for the new code. I'm 
> going to work on a design doc for the changes - any suggestions or 
> improvements are welcome.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org