[jira] [Commented] (YARN-1885) yarn logs command does not provide the application logs for some applications

2014-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981784#comment-13981784
 ] 

Hadoop QA commented on YARN-1885:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12642040/YARN-1885.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3634//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3634//console

This message is automatically generated.

> yarn logs command does not provide the application logs for some applications
> -
>
> Key: YARN-1885
> URL: https://issues.apache.org/jira/browse/YARN-1885
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Arpit Gupta
>Assignee: Wangda Tan
> Attachments: YARN-1885.patch, YARN-1885.patch
>
>
> During our HA testing we have seen cases where YARN application logs are not 
> available through the CLI, but I can look at AM logs through the UI. The RM was 
> also being restarted in the background as the application was running.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1696) Document RM HA

2014-04-25 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated YARN-1696:
---

Attachment: YARN-1696-3.patch

I hit upon this JIRA while trying to set up RM HA and have updated the patch 
based on that experience. I am attaching the updated patch, which includes fixes such as:
- added a link to site.xml,
- changed the file name,
- removed parts duplicated in the "ResourceManager Restart" page,
- changed the order of some subsections along with the removal of the duplication,
- added client configurations to the table,
- added sample configurations,
- added a description of the CLI.

Sorry for breaking in, [~kkambatl].

> Document RM HA
> --
>
> Key: YARN-1696
> URL: https://issues.apache.org/jira/browse/YARN-1696
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Karthik Kambatla
>Priority: Blocker
> Attachments: YARN-1696-3.patch, YARN-1696.2.patch, yarn-1696-1.patch
>
>
> Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
> required to call RM HA Stable and ready for public consumption. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1885) yarn logs command does not provide the application logs for some applications

2014-04-25 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-1885:
-

Attachment: YARN-1885.patch

Uploaded a new patch that fixes the NPE in the unit test.

> yarn logs command does not provide the application logs for some applications
> -
>
> Key: YARN-1885
> URL: https://issues.apache.org/jira/browse/YARN-1885
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Arpit Gupta
>Assignee: Wangda Tan
> Attachments: YARN-1885.patch, YARN-1885.patch
>
>
> During our HA testing we have seen cases where YARN application logs are not 
> available through the CLI, but I can look at AM logs through the UI. The RM was 
> also being restarted in the background as the application was running.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1987) Wrapper for leveldb DBIterator to aid in handling database exceptions

2014-04-25 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-1987:


 Summary: Wrapper for leveldb DBIterator to aid in handling 
database exceptions
 Key: YARN-1987
 URL: https://issues.apache.org/jira/browse/YARN-1987
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jason Lowe
Assignee: Jason Lowe


Per discussions in YARN-1984 and MAPREDUCE-5652, it would be nice to have a 
utility wrapper around leveldb's DBIterator to translate the raw 
RuntimeExceptions it can throw into DBExceptions to make it easier to handle 
database errors while iterating.
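To make the idea concrete, here is a minimal sketch of such a wrapper (the 
class name and the two methods shown are illustrative only, not a proposed 
final API):

{code}
import java.util.Map;

import org.iq80.leveldb.DBException;
import org.iq80.leveldb.DBIterator;

/**
 * Illustrative sketch only: forwards to a raw DBIterator and rethrows any
 * raw RuntimeException as a DBException so callers can catch a single type.
 */
public class WrappedDBIterator {
  private final DBIterator iter;

  public WrappedDBIterator(DBIterator iter) {
    this.iter = iter;
  }

  public boolean hasNext() throws DBException {
    try {
      return iter.hasNext();
    } catch (DBException e) {
      throw e;                                   // already the expected type
    } catch (RuntimeException e) {
      throw new DBException(e.getMessage(), e);  // translate raw errors
    }
  }

  public Map.Entry<byte[], byte[]> next() throws DBException {
    try {
      return iter.next();
    } catch (DBException e) {
      throw e;
    } catch (RuntimeException e) {
      throw new DBException(e.getMessage(), e);
    }
  }
}
{code}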



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1885) yarn logs command does not provide the application logs for some applications

2014-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981723#comment-13981723
 ] 

Hadoop QA commented on YARN-1885:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12642016/YARN-1885.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.TestRMAppAttemptTransitions

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3633//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3633//console

This message is automatically generated.

> yarn logs command does not provide the application logs for some applications
> -
>
> Key: YARN-1885
> URL: https://issues.apache.org/jira/browse/YARN-1885
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Arpit Gupta
>Assignee: Wangda Tan
> Attachments: YARN-1885.patch
>
>
> During our HA testing we have seen cases where YARN application logs are not 
> available through the CLI, but I can look at AM logs through the UI. The RM was 
> also being restarted in the background as the application was running.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-483) Improve documentation on log aggregation in yarn-default.xml

2014-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981650#comment-13981650
 ] 

Hudson commented on YARN-483:
-

SUCCESS: Integrated in Hadoop-trunk-Commit #5575 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5575/])
YARN-483. Improve documentation on log aggregation in yarn-default.xml (Akira 
Ajisaka via Sandy Ryza) (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1590150)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml


> Improve documentation on log aggregation in yarn-default.xml
> 
>
> Key: YARN-483
> URL: https://issues.apache.org/jira/browse/YARN-483
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Akira AJISAKA
> Fix For: 2.5.0
>
> Attachments: YARN-483.2.patch, YARN-483.patch
>
>
> The current documentation for log aggregation is 
> {code}
> <property>
>   <description>Whether to enable log aggregation</description>
>   <name>yarn.log-aggregation-enable</name>
>   <value>false</value>
> </property>
> {code}
> This could be improved to explain what enabling log aggregation does.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1983) Support heterogeneous container types at runtime on YARN

2014-04-25 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981632#comment-13981632
 ] 

Abin Shahab commented on YARN-1983:
---

Right, there may be other, better alternatives. What kinds of extensibility are 
we supporting here? What kinds of containers will YARN be able to support? What 
kind of configuration would these need?

From the answers to these questions we can start talking about the 
abstractions needed to let YARN launch all kinds of containers.

> Support heterogeneous container types at runtime on YARN
> 
>
> Key: YARN-1983
> URL: https://issues.apache.org/jira/browse/YARN-1983
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Junping Du
>
> Different container types (default, LXC, docker, VM box, etc.) have different 
> semantics on isolation of security, namespace/env, performance, etc.
> Per discussions in YARN-1964, we have some good thoughts on supporting 
> different types of containers running on YARN, specified by the application at 
> runtime, which would largely enhance YARN's flexibility to meet heterogeneous 
> apps' isolation requirements at runtime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1885) yarn logs command does not provide the application logs for some applications

2014-04-25 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-1885:
-

Attachment: YARN-1885.patch

Attached a patch implementing the approach described in my last comment; I would 
appreciate some feedback on it. Thanks!

> yarn logs command does not provide the application logs for some applications
> -
>
> Key: YARN-1885
> URL: https://issues.apache.org/jira/browse/YARN-1885
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Arpit Gupta
>Assignee: Wangda Tan
> Attachments: YARN-1885.patch
>
>
> During our HA testing we have seen cases where YARN application logs are not 
> available through the CLI, but I can look at AM logs through the UI. The RM was 
> also being restarted in the background as the application was running.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1885) yarn logs command does not provide the application logs for some applications

2014-04-25 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981630#comment-13981630
 ] 

Wangda Tan commented on YARN-1885:
--

*This is caused by the application completing in the RM while the NM never 
receives the application clean-up message after the RM restarts. This causes a 
series of problems, including but not limited to:*
* Log aggregation sometimes does not work,
* The application is shown as "RUNNING" on the NM's web page even though it has 
already terminated in the RM.

*We can reproduce this bug in the following way (in a recovery-enabled cluster):*
1) Submit an application (with a deliberate error that causes AM failure) to the RM
2) Before the application's state transitions to FAILED, restart the RM
3) After the RM restarts and the NM re-registers, the app state becomes FAILED 
in the RM, but the app is still shown as running on the NM side

*There are multiple places that contribute to this problem*
1) Race condition in ResourceTrackerService.registerNodeManager
The container-status handling logic,
{code}
if (!request.getContainerStatuses().isEmpty()) {
  LOG.info("received container statuses on node manager register :"
      + request.getContainerStatuses());
  for (ContainerStatus containerStatus : request.getContainerStatuses()) {
    handleContainerStatus(containerStatus);
  }
}
{code}
happens before the RMNodeImpl instance is created:
{code}
RMNode rmNode = new RMNodeImpl(nodeId, rmContext, host, cmPort, httpPort,
    resolve(host), ResourceOption.newInstance(capability,
        RMNode.OVER_COMMIT_TIMEOUT_MILLIS_DEFAULT),
    nodeManagerVersion);

RMNode oldNode = this.rmContext.getRMNodes().putIfAbsent(nodeId, rmNode);
if (oldNode == null) {
  this.rmContext.getDispatcher().getEventHandler().handle(
      new RMNodeEvent(nodeId, RMNodeEventType.STARTED));
} else {
  LOG.info("Reconnect from the node at: " + host);
  this.nmLivelinessMonitor.unregister(nodeId);
  this.rmContext.getDispatcher().getEventHandler().handle(
      new RMNodeReconnectEvent(nodeId, rmNode));
}
{code}
So RMAppImpl.FinalTransition finishes the application but cannot notify the 
corresponding RMNode.
2) The RMAppAttempt cannot recover its full ranNodes set after restart (the 
RMAppAttempt is set to the LAUNCHED state after restart).

*Proposal*
1) Include the full list of running applications when the NM registers with the RM
2) ResourceTrackerService (RTS for short) will then:
* If the RMApp is not in a final state, add the RMNode to the RMAppAttempt's ranNodes.
* If the RMApp is already in a final state, send an RMNodeCleanAppEvent to the RMNode.

3) Address the race condition described above (a rough sketch of step 2 follows)
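
To make step 2 concrete, a rough sketch of the registration-time 
reconciliation as a fragment of ResourceTrackerService. This is illustrative 
only: getRunningApplications() on the register request and the 
isInFinalState()/addRanNode() helpers are proposed names, not existing APIs.
{code}
// Sketch only: imagined addition to ResourceTrackerService.registerNodeManager().
// getRunningApplications(), isInFinalState() and addRanNode() do not exist yet.
private void reconcileRunningApps(NodeId nodeId,
    RegisterNodeManagerRequest request) {
  for (ApplicationId appId : request.getRunningApplications()) {
    RMApp app = this.rmContext.getRMApps().get(appId);
    if (app == null) {
      continue;                              // app unknown to this RM
    }
    if (isInFinalState(app)) {
      // RM already finished the app but the NM missed the clean-up message:
      // tell the node to clean it up, which also lets log aggregation finish.
      this.rmContext.getDispatcher().getEventHandler().handle(
          new RMNodeCleanAppEvent(nodeId, appId));
    } else {
      // App still live: record this node in the attempt's ranNodes so the
      // eventual RMAppImpl.FinalTransition can notify it.
      addRanNode(app.getCurrentAppAttempt(), nodeId);
    }
  }
}
{code}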


> yarn logs command does not provide the application logs for some applications
> -
>
> Key: YARN-1885
> URL: https://issues.apache.org/jira/browse/YARN-1885
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Arpit Gupta
>Assignee: Wangda Tan
>
> During our HA testing we have seen cases where YARN application logs are not 
> available through the CLI, but I can look at AM logs through the UI. The RM was 
> also being restarted in the background as the application was running.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (YARN-1885) yarn logs command does not provide the application logs for some applications

2014-04-25 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan reassigned YARN-1885:


Assignee: Wangda Tan

> yarn logs command does not provide the application logs for some applications
> -
>
> Key: YARN-1885
> URL: https://issues.apache.org/jira/browse/YARN-1885
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Arpit Gupta
>Assignee: Wangda Tan
>
> During our HA testing we have seen cases where YARN application logs are not 
> available through the CLI, but I can look at AM logs through the UI. The RM was 
> also being restarted in the background as the application was running.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1983) Support heterogeneous container types at runtime on YARN

2014-04-25 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981612#comment-13981612
 ] 

Junping Du commented on YARN-1983:
--

Thanks [~ashahab] for volunteering on this. I think we could discuss more 
options before quickly jumping on the effort of extending ContainerRequest in 
case other people may have some better ideas. Thoughts?

> Support heterogeneous container types at runtime on YARN
> 
>
> Key: YARN-1983
> URL: https://issues.apache.org/jira/browse/YARN-1983
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Junping Du
>
> Different container types (default, LXC, docker, VM box, etc.) have different 
> semantics on isolation of security, namespace/env, performance, etc.
> Per discussions in YARN-1964, we have some good thoughts on supporting 
> different types of containers running on YARN, specified by the application at 
> runtime, which would largely enhance YARN's flexibility to meet heterogeneous 
> apps' isolation requirements at runtime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-483) Improve documentation on log aggregation in yarn-default.xml

2014-04-25 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981596#comment-13981596
 ] 

Sandy Ryza commented on YARN-483:
-

+1

> Improve documentation on log aggregation in yarn-default.xml
> 
>
> Key: YARN-483
> URL: https://issues.apache.org/jira/browse/YARN-483
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Akira AJISAKA
> Attachments: YARN-483.2.patch, YARN-483.patch
>
>
> The current documentation for log aggregation is 
> {code}
> <property>
>   <description>Whether to enable log aggregation</description>
>   <name>yarn.log-aggregation-enable</name>
>   <value>false</value>
> </property>
> {code}
> This could be improved to explain what enabling log aggregation does.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1864) Fair Scheduler Dynamic Hierarchical User Queues

2014-04-25 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981589#comment-13981589
 ] 

Sandy Ryza commented on YARN-1864:
--

I think it's better to leave out case 4.  The right behavior on it is fuzzy, 
and things are simpler if the results returned by QueuePlacementPolicy are only 
a function of the configuration. 

No other comments than that at the moment.

> Fair Scheduler Dynamic Hierarchical User Queues
> ---
>
> Key: YARN-1864
> URL: https://issues.apache.org/jira/browse/YARN-1864
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Reporter: Ashwin Shankar
>  Labels: scheduler
> Attachments: YARN-1864-v1.txt, YARN-1864-v2.txt, YARN-1864-v3.txt
>
>
> In Fair Scheduler, we want to be able to create user queues under any parent 
> queue in the hierarchy. For example, say user1 submits a job to a parent queue 
> called root.allUserQueues; we want to be able to create a new queue called 
> root.allUserQueues.user1 and run user1's job in it. Any further jobs submitted 
> by this user to root.allUserQueues will be run in this newly created 
> root.allUserQueues.user1.
> This is very similar to the 'user-as-default' feature in Fair Scheduler, which 
> creates user queues under the root queue. But we want the ability to create user 
> queues under ANY parent queue.
> Why do we want this?
> 1. Preemption: these dynamically created user queues can preempt each other 
> if their fair share is not met, so there is fairness among users.
> User queues can also preempt other non-user leaf queues if below fair 
> share.
> 2. Allocation to user queues: we want all the (ad hoc) user queries to consume 
> only a fraction of the resources in the shared cluster. With this 
> feature, we could do that by giving a fair share to the parent user queue, 
> which is then redistributed to all the dynamically created user queues.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1983) Support heterogeneous container types at runtime on YARN

2014-04-25 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981509#comment-13981509
 ] 

Abin Shahab commented on YARN-1983:
---

After YARN-1964, I can work on extending the containerRequest so that it can 
accommodate these changes at run time.
Abin

> Support heterogeneous container types at runtime on YARN
> 
>
> Key: YARN-1983
> URL: https://issues.apache.org/jira/browse/YARN-1983
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Junping Du
>
> Different container types (default, LXC, docker, VM box, etc.) have different 
> semantics on isolation of security, namespace/env, performance, etc.
> Per discussions in YARN-1964, we have some good thoughts on supporting 
> different types of containers running on YARN, specified by the application at 
> runtime, which would largely enhance YARN's flexibility to meet heterogeneous 
> apps' isolation requirements at runtime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1986) After upgrade from 2.2.0 to 2.4.0, NPE on first job start.

2014-04-25 Thread Jon Bringhurst (JIRA)
Jon Bringhurst created YARN-1986:


 Summary: After upgrade from 2.2.0 to 2.4.0, NPE on first job start.
 Key: YARN-1986
 URL: https://issues.apache.org/jira/browse/YARN-1986
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Jon Bringhurst


After an upgrade from 2.2.0 to 2.4.0, an NPE occurs on first job start.

After the RM was restarted, the job ran without a problem.

{noformat}
19:11:13,441 FATAL ResourceManager:600 - Error in handling event type 
NODE_UPDATE to the scheduler
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:462)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:714)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:743)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:104)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
at java.lang.Thread.run(Thread.java:744)
19:11:13,443  INFO ResourceManager:604 - Exiting, bbye..
{noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1985) YARN issues wrong state when "running beyond virtual memory limits"

2014-04-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981497#comment-13981497
 ] 

Jason Lowe commented on YARN-1985:
--

The exit status should be whatever exit status came from the process when it 
exited.  When a container is killed the NM first sends a SIGTERM and then a 
short time later (250 msec IIRC) it sends SIGKILL.  A process that exits with a 
status code of 0 despite receiving SIGTERM could explain the behavior.  It 
could also happen if the container exited on its own after the NM logged that 
it was going to kill it but before it actually tried to kill it.

Looking at the DefaultContainerExecutor code, it certainly appears that the 
process being killed must have returned an exit code of zero, unless you are 
seeing messages such as "Exit code from container 
container_1398429077682_0006_02_05 is : " in the logs.  I'm not sure 
exactly what's being run in the container, but checking whether it will return an 
exit code of 0 despite being killed by SIGTERM seems like the next best place 
to look.
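
As an illustration of how a process controls the code it reports on SIGTERM: 
if the container command is itself a JVM, a handler like the following (using 
the JDK-internal sun.misc.Signal API; a sketch, not a recommendation) makes 
the SIGTERM exit status explicit rather than whatever a wrapper happens to return:
{code}
import sun.misc.Signal;
import sun.misc.SignalHandler;

public class TermExit {
  public static void main(String[] args) throws InterruptedException {
    // Report the conventional 128 + 15 on SIGTERM; a wrapper script that
    // traps TERM and exits 0 would instead produce the "clean" 0 seen here.
    Signal.handle(new Signal("TERM"), new SignalHandler() {
      @Override
      public void handle(Signal sig) {
        System.exit(128 + 15);
      }
    });
    Thread.sleep(Long.MAX_VALUE);   // stand-in for the real container work
  }
}
{code}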

> YARN issues wrong state when "running beyond virtual memory limits"
> ---
>
> Key: YARN-1985
> URL: https://issues.apache.org/jira/browse/YARN-1985
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>Priority: Minor
>
> When deploying a YARN application with multiple containers, if the AM determines 
> that resource limits have been reached (e.g., virtual memory) it starts 
> killing *all* containers while reporting a *single* COMPLETED status, 
> essentially hanging the AM waiting for more containers to report their state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN

2014-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981489#comment-13981489
 ] 

Hadoop QA commented on YARN-1964:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12641983/yarn-1964-docker.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3632//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3632//console

This message is automatically generated.

> Create Docker analog of the LinuxContainerExecutor in YARN
> --
>
> Key: YARN-1964
> URL: https://issues.apache.org/jira/browse/YARN-1964
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 2.2.0
>Reporter: Arun C Murthy
>Assignee: Abin Shahab
> Attachments: yarn-1964-branch-2.2.0-docker.patch, 
> yarn-1964-branch-2.2.0-docker.patch, yarn-1964-docker.patch, 
> yarn-1964-docker.patch
>
>
> Docker (https://www.docker.io/) is, increasingly, a very popular container 
> technology.
> In the context of YARN, support for Docker will provide a very elegant 
> solution, allowing applications to *package* their software into a Docker 
> container (an entire Linux file system, incl. custom versions of perl, python, 
> etc.) and use it as a blueprint to launch all their YARN containers with the 
> requisite software environment. This provides both consistency (all YARN 
> containers will have the same software environment) and isolation (no 
> interference with whatever is installed on the physical machine).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1985) YARN issues wrong state when "running beyond virtual memory limits"

2014-04-25 Thread Oleg Zhurakousky (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Zhurakousky updated YARN-1985:
---

Priority: Minor  (was: Major)

> YARN issues wrong state when "running beyond virtual memory limits"
> ---
>
> Key: YARN-1985
> URL: https://issues.apache.org/jira/browse/YARN-1985
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>Priority: Minor
>
> When deploying a YARN application with multiple containers, if the AM determines 
> that resource limits have been reached (e.g., virtual memory) it starts 
> killing *all* containers while reporting a *single* COMPLETED status, 
> essentially hanging the AM waiting for more containers to report their state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1985) YARN issues wrong state when "running beyond virtual memory limits"

2014-04-25 Thread Oleg Zhurakousky (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981474#comment-13981474
 ] 

Oleg Zhurakousky commented on YARN-1985:


Actually, a bit of good news. The other two containers didn't start because 
one of my nodes had its date/time messed up, resulting in
{code}
org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start 
container. 
This token is expired. current time is 1398449721411 found 1398448925681
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
. . . 
{code} 
So handling the 'onStartContainerError' event would do.
This makes it much less of an issue and I can work around it (actually I 
already did), but the fact that the _ExitStatus_ for the containers that did start 
was 0 is still a problem.
Downgrading it to minor.
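
For reference, a minimal sketch of what handling that callback might look like 
in an NMClientAsync handler (the outstanding counter field is illustrative):
{code}
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerStatus;
import org.apache.hadoop.yarn.client.api.async.NMClientAsync;

/** Illustrative sketch: count a failed start so the AM's bookkeeping converges. */
public class LaunchHandler implements NMClientAsync.CallbackHandler {
  private final AtomicInteger outstanding;   // containers not yet accounted for

  public LaunchHandler(AtomicInteger outstanding) {
    this.outstanding = outstanding;
  }

  @Override
  public void onStartContainerError(ContainerId id, Throwable t) {
    // The expired-token failure above lands here instead of silently hanging.
    outstanding.decrementAndGet();
  }

  // Remaining callbacks: no-ops for this sketch.
  @Override public void onContainerStarted(ContainerId id, Map<String, ByteBuffer> resp) { }
  @Override public void onContainerStatusReceived(ContainerId id, ContainerStatus status) { }
  @Override public void onContainerStopped(ContainerId id) { }
  @Override public void onGetContainerStatusError(ContainerId id, Throwable t) { }
  @Override public void onStopContainerError(ContainerId id, Throwable t) { }
}
{code}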

> YARN issues wrong state when "running beyond virtual memory limits"
> ---
>
> Key: YARN-1985
> URL: https://issues.apache.org/jira/browse/YARN-1985
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>Priority: Minor
>
> When deploying a YARN application with multiple containers, if the AM determines 
> that resource limits have been reached (e.g., virtual memory) it starts 
> killing *all* containers while reporting a *single* COMPLETED status, 
> essentially hanging the AM waiting for more containers to report their state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1063) Winutils needs ability to create task as domain user

2014-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981445#comment-13981445
 ] 

Hadoop QA commented on YARN-1063:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12641970/YARN-1063.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1279 javac 
compiler warnings (more than the trunk's current 1278 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3630//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/3630//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3630//console

This message is automatically generated.

> Winutils needs ability to create task as domain user
> 
>
> Key: YARN-1063
> URL: https://issues.apache.org/jira/browse/YARN-1063
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
> Environment: Windows
>Reporter: Kyle Leckie
>Assignee: Remus Rusanu
>  Labels: security, windows
> Attachments: YARN-1063.2.patch, YARN-1063.3.patch, YARN-1063.patch
>
>
> h1. Summary:
> Securing a Hadoop cluster requires constructing some form of security 
> boundary around the processes executed in YARN containers. Isolation based on 
> Windows user isolation seems most feasible. This approach is similar to the 
> approach taken by the existing LinuxContainerExecutor. The current patch to 
> winutils.exe adds the ability to create a process as a domain user. 
> h1. Alternative Methods considered:
> h2. Process rights limited by security token restriction:
> On Windows access decisions are made by examining the security token of a 
> process. It is possible to spawn a process with a restricted security token. 
> Any of the rights granted by SIDs of the default token may be restricted. It 
> is possible to see this in action by examining the security token of a 
> sandboxed process launched by a web browser. Typically the launched process 
> will have a fully restricted token and need to access machine resources 
> through a dedicated broker process that enforces a custom security policy. 
> This broker process mechanism would break compatibility with the typical 
> Hadoop container process. The Container process must be able to utilize 
> standard function calls for disk and network IO. I performed some work 
> looking at ways to ACL the local files to the specific launched process without 
> granting rights to other processes launched on the same machine but found 
> this to be an overly complex solution. 
> h2. Relying on APP containers:
> Recent versions of windows have the ability to launch processes within an 
> isolated container. Application containers are supported for execution of 
> WinRT based executables. This method was ruled out due to the lack of 
> official support for standard windows APIs. At some point in the future 
> windows may support functionality similar to BSD jails or Linux containers, 
> at that point support for containers should be added.
> h1. Create As User Feature Description:
> h2. Usage:
> A new sub command was added to the set of task commands. Here is the syntax:
> winutils task createAsUser [TASKNAME] [USERNAME] [COMMAND_LINE]
> Some notes:
> * The username specified is in the format of "user@domain"
> * The machine executing this command must be joined to the domain of the user 
> specified
> * The domain controller must allow the account executing the command access 
> to the user information. For this join the account to the predefined group 
> labeled "Pre-Windows 2000 Compatible Access"
> * The account running the command must have several rights on the local 
> machine. These can be managed manually using secpol.msc: 
> ** "Act as part of the operating system" - SE_TCB_NAME
> ** "Replace a process-level token" - SE_ASSIGNPRIMARYTOKEN_NAME
> ** "Adjust memory quotas for a process" - SE_INCREASE_QUOTA_NAME

[jira] [Updated] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN

2014-04-25 Thread Abin Shahab (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abin Shahab updated YARN-1964:
--

Attachment: yarn-1964-docker.patch

Trunk patch with tests passing.

> Create Docker analog of the LinuxContainerExecutor in YARN
> --
>
> Key: YARN-1964
> URL: https://issues.apache.org/jira/browse/YARN-1964
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 2.2.0
>Reporter: Arun C Murthy
>Assignee: Abin Shahab
> Attachments: yarn-1964-branch-2.2.0-docker.patch, 
> yarn-1964-branch-2.2.0-docker.patch, yarn-1964-docker.patch, 
> yarn-1964-docker.patch
>
>
> Docker (https://www.docker.io/) is, increasingly, a very popular container 
> technology.
> In the context of YARN, support for Docker will provide a very elegant 
> solution, allowing applications to *package* their software into a Docker 
> container (an entire Linux file system, incl. custom versions of perl, python, 
> etc.) and use it as a blueprint to launch all their YARN containers with the 
> requisite software environment. This provides both consistency (all YARN 
> containers will have the same software environment) and isolation (no 
> interference with whatever is installed on the physical machine).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1681) When "banned.users" is not set in LCE's container-executor.cfg, submit job with user in DEFAULT_BANNED_USERS will receive unclear error message

2014-04-25 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981421#comment-13981421
 ] 

Junping Du commented on YARN-1681:
--

Nice catch, [~wzc1989]! Patch looks good to me. However, would you like to add 
a unit test in test-container-executor.c to cover this case?

> When "banned.users" is not set in LCE's container-executor.cfg, submit job 
> with user in DEFAULT_BANNED_USERS will receive unclear error message
> ---
>
> Key: YARN-1681
> URL: https://issues.apache.org/jira/browse/YARN-1681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.2.0
>Reporter: Zhichun Wu
>Priority: Minor
>  Labels: container, usability
> Attachments: YARN-1681.patch
>
>
> When using LCE in a secure setup, if "banned.users" is not set in 
> container-executor.cfg, submitting a job as a user in DEFAULT_BANNED_USERS 
> ("mapred", "hdfs", "bin", 0) will produce an unclear error message.
> For example, if we use hdfs to submit an MR job, we may see the following on 
> the yarn app overview page:
> {code}
> appattempt_1391353981633_0003_02 exited with exitCode: -1000 due to: 
> Application application_1391353981633_0003 initialization failed 
> (exitCode=139) with output: 
> {code}
> while the preferred error message would look like:
> {code}
> appattempt_1391353981633_0003_02 exited with exitCode: -1000 due to: 
> Application application_1391353981633_0003 initialization failed 
> (exitCode=139) with output: Requested user hdfs is banned 
> {code}
> Just a minor bug, and I would like to start contributing to hadoop-common with 
> it :)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1985) YARN issues wrong state when "running beyond virtual memory limits"

2014-04-25 Thread Oleg Zhurakousky (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981409#comment-13981409
 ] 

Oleg Zhurakousky commented on YARN-1985:


My theory is confirmed. After fixing my bug, the application finished with SUCCEEDED 
status, which is obviously wrong.

What makes it an even bigger problem IMHO is that YARN apparently decided not 
to even attempt to start the other two containers, which creates an interesting 
dilemma. 
How do you monitor overall application completion when:
4 containers are allocated,
2 started and were killed,
2 didn't start?
Sure, I can use an AtomicInteger and increment/decrement it. But if I don't have 
any guarantee around container start attempts, I may exit too soon. For 
example, in my case such a counter would go from 0 to 2 and then back to 0, 
signifying completion, and as I am exiting YARN may decide to start another 
container.
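
One way around that dilemma is to key completion to the number of containers 
*allocated* rather than started, counting completions and failed starts alike as 
terminal events. A minimal sketch (names are illustrative; a real implementation 
should deduplicate by ContainerId in case a failed start is also reported later 
as completed):
{code}
import java.util.List;
import java.util.concurrent.CountDownLatch;

import org.apache.hadoop.yarn.api.records.ContainerStatus;

/** Illustrative sketch: completion is keyed to containers ALLOCATED. */
public class CompletionTracker {
  private final CountDownLatch remaining;

  public CompletionTracker(int containersAllocated) {
    // One count per allocated container: killed, failed-to-start, and
    // normally-exiting containers must each count down exactly once.
    this.remaining = new CountDownLatch(containersAllocated);
  }

  /** Call from AMRMClientAsync.CallbackHandler#onContainersCompleted. */
  public void onCompleted(List<ContainerStatus> statuses) {
    for (ContainerStatus s : statuses) {
      remaining.countDown();
    }
  }

  /** Call from NMClientAsync.CallbackHandler#onStartContainerError. */
  public void onStartError() {
    remaining.countDown();
  }

  /** The AM blocks here instead of watching a started/finished counter. */
  public void awaitAll() throws InterruptedException {
    remaining.await();
  }
}
{code}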

> YARN issues wrong state when "running beyond virtual memory limits"
> ---
>
> Key: YARN-1985
> URL: https://issues.apache.org/jira/browse/YARN-1985
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>
> When deploying a YARN application with multiple containers, if the AM determines 
> that resource limits have been reached (e.g., virtual memory) it starts 
> killing *all* containers while reporting a *single* COMPLETED status, 
> essentially hanging the AM waiting for more containers to report their state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN

2014-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981401#comment-13981401
 ] 

Hadoop QA commented on YARN-1964:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12641977/yarn-1964-branch-2.2.0-docker.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3631//console

This message is automatically generated.

> Create Docker analog of the LinuxContainerExecutor in YARN
> --
>
> Key: YARN-1964
> URL: https://issues.apache.org/jira/browse/YARN-1964
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 2.2.0
>Reporter: Arun C Murthy
>Assignee: Abin Shahab
> Attachments: yarn-1964-branch-2.2.0-docker.patch, 
> yarn-1964-branch-2.2.0-docker.patch, yarn-1964-docker.patch
>
>
> Docker (https://www.docker.io/) is, increasingly, a very popular container 
> technology.
> In the context of YARN, support for Docker will provide a very elegant 
> solution, allowing applications to *package* their software into a Docker 
> container (an entire Linux file system, incl. custom versions of perl, python, 
> etc.) and use it as a blueprint to launch all their YARN containers with the 
> requisite software environment. This provides both consistency (all YARN 
> containers will have the same software environment) and isolation (no 
> interference with whatever is installed on the physical machine).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN

2014-04-25 Thread Abin Shahab (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abin Shahab updated YARN-1964:
--

Attachment: yarn-1964-branch-2.2.0-docker.patch

This should pass on the branch.

> Create Docker analog of the LinuxContainerExecutor in YARN
> --
>
> Key: YARN-1964
> URL: https://issues.apache.org/jira/browse/YARN-1964
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 2.2.0
>Reporter: Arun C Murthy
>Assignee: Abin Shahab
> Attachments: yarn-1964-branch-2.2.0-docker.patch, 
> yarn-1964-branch-2.2.0-docker.patch, yarn-1964-docker.patch
>
>
> Docker (https://www.docker.io/) is, increasingly, a very popular container 
> technology.
> In the context of YARN, support for Docker will provide a very elegant 
> solution, allowing applications to *package* their software into a Docker 
> container (an entire Linux file system, incl. custom versions of perl, python, 
> etc.) and use it as a blueprint to launch all their YARN containers with the 
> requisite software environment. This provides both consistency (all YARN 
> containers will have the same software environment) and isolation (no 
> interference with whatever is installed on the physical machine).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1063) Winutils needs ability to create task as domain user

2014-04-25 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated YARN-1063:
---

Attachment: YARN-1063.3.patch

Now with more whitespace

> Winutils needs ability to create task as domain user
> 
>
> Key: YARN-1063
> URL: https://issues.apache.org/jira/browse/YARN-1063
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
> Environment: Windows
>Reporter: Kyle Leckie
>Assignee: Remus Rusanu
>  Labels: security, windows
> Attachments: YARN-1063.2.patch, YARN-1063.3.patch, YARN-1063.patch
>
>
> h1. Summary:
> Securing a Hadoop cluster requires constructing some form of security 
> boundary around the processes executed in YARN containers. Isolation based on 
> Windows user isolation seems most feasible. This approach is similar to the 
> approach taken by the existing LinuxContainerExecutor. The current patch to 
> winutils.exe adds the ability to create a process as a domain user. 
> h1. Alternative Methods considered:
> h2. Process rights limited by security token restriction:
> On Windows access decisions are made by examining the security token of a 
> process. It is possible to spawn a process with a restricted security token. 
> Any of the rights granted by SIDs of the default token may be restricted. It 
> is possible to see this in action by examining the security token of a 
> sandboxed process launched by a web browser. Typically the launched process 
> will have a fully restricted token and need to access machine resources 
> through a dedicated broker process that enforces a custom security policy. 
> This broker process mechanism would break compatibility with the typical 
> Hadoop container process. The Container process must be able to utilize 
> standard function calls for disk and network IO. I performed some work 
> looking at ways to ACL the local files to the specific launched process without 
> granting rights to other processes launched on the same machine but found 
> this to be an overly complex solution. 
> h2. Relying on APP containers:
> Recent versions of windows have the ability to launch processes within an 
> isolated container. Application containers are supported for execution of 
> WinRT based executables. This method was ruled out due to the lack of 
> official support for standard windows APIs. At some point in the future 
> windows may support functionality similar to BSD jails or Linux containers, 
> at that point support for containers should be added.
> h1. Create As User Feature Description:
> h2. Usage:
> A new sub command was added to the set of task commands. Here is the syntax:
> winutils task createAsUser [TASKNAME] [USERNAME] [COMMAND_LINE]
> Some notes:
> * The username specified is in the format of "user@domain"
> * The machine executing this command must be joined to the domain of the user 
> specified
> * The domain controller must allow the account executing the command access 
> to the user information. For this join the account to the predefined group 
> labeled "Pre-Windows 2000 Compatible Access"
> * The account running the command must have several rights on the local 
> machine. These can be managed manually using secpol.msc: 
> ** "Act as part of the operating system" - SE_TCB_NAME
> ** "Replace a process-level token" - SE_ASSIGNPRIMARYTOKEN_NAME
> ** "Adjust memory quotas for a process" - SE_INCREASE_QUOTA_NAME
> * The launched process will not have rights to the desktop so will not be 
> able to display any information or create UI.
> * The launched process will have no network credentials. Any access of 
> network resources that requires domain authentication will fail.
> h2. Implementation:
> Winutils performs the following steps:
> # Enable the required privileges for the current process.
> # Register as a trusted process with the Local Security Authority (LSA).
> # Create a new logon for the user passed on the command line.
> # Load/Create a profile on the local machine for the new logon.
> # Create a new environment for the new logon.
> # Launch the new process in a job with the task name specified and using the 
> created logon.
> # Wait for the JOB to exit.
> h2. Future work:
> The following work was scoped out of this check in:
> * Support for non-domain users or machine that are not domain joined.
> * Support for privilege isolation by running the task launcher in a high 
> privilege service with access over an ACLed named pipe.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1985) YARN issues wrong state when "running beyond virtual memory limits"

2014-04-25 Thread Oleg Zhurakousky (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981332#comment-13981332
 ] 

Oleg Zhurakousky commented on YARN-1985:


Also, you are saying only 3 states. What about all those LOCALIZING, LOCALIZED, 
KILLING, ACQUIRED, etc.?
Anyway, here is the NM log fragment:
{code}
2014-04-25 12:47:45,230 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
 Removed ProcessTree with root 12510
2014-04-25 12:47:45,230 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: 
Container container_1398429077682_0006_02_03 transitioned from RUNNING to 
KILLING
2014-04-25 12:47:45,230 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
 Cleaning up container container_1398429077682_0006_02_03
2014-04-25 12:47:45,630 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
 Start request for container_1398429077682_0006_02_05 by user oleg
2014-04-25 12:47:45,631 INFO 
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=oleg 
IP=192.168.19.10 OPERATION=Start Container Request   
TARGET=ContainerManageImpl  RESULT=SUCCESS  
APPID=application_1398429077682_0006
CONTAINERID=container_1398429077682_0006_02_05
2014-04-25 12:47:45,631 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
 Adding container_1398429077682_0006_02_05 to application 
application_1398429077682_0006
2014-04-25 12:47:45,632 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: 
Container container_1398429077682_0006_02_05 transitioned from NEW to 
LOCALIZING
2014-04-25 12:47:45,632 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got 
event CONTAINER_INIT for appId application_1398429077682_0006
2014-04-25 12:47:45,634 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: 
Container container_1398429077682_0006_02_05 transitioned from LOCALIZING 
to LOCALIZED
2014-04-25 12:47:45,660 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: 
Container container_1398429077682_0006_02_05 transitioned from LOCALIZED to 
RUNNING
2014-04-25 12:47:45,677 INFO 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: 
launchContainer: [nice, -n, 0, bash, 
/tmp/hadoop-oleg/nm-local-dir/usercache/oleg/appcache/application_1398429077682_0006/container_1398429077682_0006_02_05/default_container_executor.sh]
2014-04-25 12:47:48,230 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
 Starting resource-monitoring for container_1398429077682_0006_02_05
2014-04-25 12:47:48,248 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
 Memory usage of ProcessTree 12598 for container-id 
container_1398429077682_0006_02_05: 39.7 MB of 256 MB physical memory used; 
1.8 GB of 537.6 MB virtual memory used
2014-04-25 12:47:48,248 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
 Process tree for container: container_1398429077682_0006_02_05 running 
over twice the configured limit. Limit=563714432, current usage = 1985445888
2014-04-25 12:47:48,249 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
 Container [pid=12598,containerID=container_1398429077682_0006_02_05] is 
running beyond virtual memory limits. Current usage: 39.7 MB of 256 MB physical 
memory used; 1.8 GB of 537.6 MB virtual memory used. Killing container.
Dump of the process-tree for container_1398429077682_0006_02_05 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
. . .
{code}

> YARN issues wrong state when "running beyond virtual memory limits"
> ---
>
> Key: YARN-1985
> URL: https://issues.apache.org/jira/browse/YARN-1985
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>
> When deploying a YARN application with multiple containers, if the AM determines 
> that resource limits have been reached (e.g., virtual memory) it starts 
> killing *all* containers while reporting a *single* COMPLETED status, 
> essentially hanging the AM waiting for more containers to report their state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1985) YARN issues wrong state when "running beyond virtual memory limits"

2014-04-25 Thread Oleg Zhurakousky (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981322#comment-13981322
 ] 

Oleg Zhurakousky commented on YARN-1985:


I'd agree with you, but the reported status is 0.
{code}
ExitStatus: 0, ]
{code}

> YARN issues wrong state when "running beyond virtual memory limits"
> ---
>
> Key: YARN-1985
> URL: https://issues.apache.org/jira/browse/YARN-1985
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>
> When deploying a YARN application with multiple containers, if the AM determines 
> that resource limits have been reached (e.g., virtual memory) it starts 
> killing *all* containers while reporting a *single* COMPLETED status, 
> essentially hanging the AM waiting for more containers to report their state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1681) When "banned.users" is not set in LCE's container-executor.cfg, submit job with user in DEFAULT_BANNED_USERS will receive unclear error message

2014-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1681:
--

Labels: container usability  (was: container)

> When "banned.users" is not set in LCE's container-executor.cfg, submit job 
> with user in DEFAULT_BANNED_USERS will receive unclear error message
> ---
>
> Key: YARN-1681
> URL: https://issues.apache.org/jira/browse/YARN-1681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.2.0
>Reporter: Zhichun Wu
>Priority: Minor
>  Labels: container, usability
> Attachments: YARN-1681.patch
>
>
> When using LCE in a secure setup, if "banned.users" is not set in 
> container-executor.cfg, submitting a job as a user in DEFAULT_BANNED_USERS 
> ("mapred", "hdfs", "bin", 0) will produce an unclear error message.
> For example, if we use hdfs to submit an MR job, we may see the following on 
> the yarn app overview page:
> {code}
> appattempt_1391353981633_0003_02 exited with exitCode: -1000 due to: 
> Application application_1391353981633_0003 initialization failed 
> (exitCode=139) with output: 
> {code}
> while the preferred error message would look like:
> {code}
> appattempt_1391353981633_0003_02 exited with exitCode: -1000 due to: 
> Application application_1391353981633_0003 initialization failed 
> (exitCode=139) with output: Requested user hdfs is banned 
> {code}
> Just a minor bug, and I would like to start contributing to hadoop-common with 
> it :)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1985) YARN issues wrong state when "running beyond virtual memory limits"

2014-04-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981312#comment-13981312
 ] 

Jason Lowe commented on YARN-1985:
--

There are only three states for a container: NEW, RUNNING, or COMPLETED.  Note 
that COMPLETED does not imply success; rather, it means the container is no longer 
running.  In order to discern success or failure for a completed container one 
must examine the exit code of the container (i.e.: the 
ContainerStatus#getExitStatus method).
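
To make that concrete, a minimal sketch of classifying completed containers by 
exit status rather than by state (ContainerExitStatus.SUCCESS is 0; the 
surrounding class is illustrative):
{code}
import java.util.List;

import org.apache.hadoop.yarn.api.records.ContainerExitStatus;
import org.apache.hadoop.yarn.api.records.ContainerStatus;

/** Illustrative sketch: judge completed containers by exit status, not state. */
public class CompletionCheck {
  /** Intended for use from AMRMClientAsync.CallbackHandler#onContainersCompleted. */
  public static void onContainersCompleted(List<ContainerStatus> statuses) {
    for (ContainerStatus status : statuses) {
      int exit = status.getExitStatus();
      if (exit == ContainerExitStatus.SUCCESS) {          // 0
        System.out.println(status.getContainerId() + " succeeded");
      } else if (exit == ContainerExitStatus.ABORTED) {   // killed/released by the framework
        System.out.println(status.getContainerId() + " was aborted: "
            + status.getDiagnostics());
      } else {
        System.out.println(status.getContainerId() + " failed, exit status " + exit);
      }
    }
  }
}
{code}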

Are both containers running over their memory limits or is only one running 
over and somehow both are being killed?  That's where the RM/NM logs would help.

> YARN issues wrong state when "running beyond virtual memory limits"
> ---
>
> Key: YARN-1985
> URL: https://issues.apache.org/jira/browse/YARN-1985
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>
> When deploying a YARN application with multiple containers and the AM 
> determines that the resource limits have been reached (e.g., virtual memory), 
> it starts killing *all* containers while reporting a *single* COMPLETED 
> status, essentially hanging the AM waiting for more containers to report 
> their state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1985) YARN issues wrong state when "running beyond virtual memory limits"

2014-04-25 Thread Oleg Zhurakousky (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981302#comment-13981302
 ] 

Oleg Zhurakousky commented on YARN-1985:


Actually, as I stated in my last comment, it's 2 containers. So out of four, 2 
were started and killed immediately and 2 were not started at all. While I have 
to fix my own problem of properly counting how many containers were started vs. 
finished/running, the real issue is that such a major error condition is 
reported essentially as success. Basically, if I didn't have a bug on my end 
which made my AM hang, I would probably end up seeing SUCCEEDED in the RM 
console; I am fixing it now and will follow up if I do see SUCCEEDED. But in 
any event it's confusing when it simply reports COMPLETED while describing a 
major error.

> YARN issues wrong state when "running beyond virtual memory limits"
> ---
>
> Key: YARN-1985
> URL: https://issues.apache.org/jira/browse/YARN-1985
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>
> When deploying a YARN application with multiple containers and the AM 
> determines that the resource limits have been reached (e.g., virtual memory), 
> it starts killing *all* containers while reporting a *single* COMPLETED 
> status, essentially hanging the AM waiting for more containers to report 
> their state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1985) YARN issues wrong state when "running beyond virtual memory limits"

2014-04-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981296#comment-13981296
 ] 

Jason Lowe commented on YARN-1985:
--

Do you have the relevant portions of the RM log for these 4 containers showing 
it has marked them completed?  If these all occurred on the same node, the 
relevant NM log would be great as well.

> YARN issues wrong state when "running beyond virtual memory limits"
> ---
>
> Key: YARN-1985
> URL: https://issues.apache.org/jira/browse/YARN-1985
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>
> When deploying a YARN application with multiple containers and the AM 
> determines that the resource limits have been reached (e.g., virtual memory), 
> it starts killing *all* containers while reporting a *single* COMPLETED 
> status, essentially hanging the AM waiting for more containers to report 
> their state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1985) YARN issues wrong state when "running beyond virtual memory limits"

2014-04-25 Thread Oleg Zhurakousky (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981285#comment-13981285
 ] 

Oleg Zhurakousky commented on YARN-1985:


Actually, the 4 COMPLETE reports are log duplication, so it is actually two. 
The other two containers didn't even start. Which is fine, but the real issue 
is that while it's clearly an ERROR condition, it is reported as a simple 
COMPLETE.

> YARN issues wrong state when "running beyond virtual memory limits"
> ---
>
> Key: YARN-1985
> URL: https://issues.apache.org/jira/browse/YARN-1985
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>
> When deploying a YARN application with multiple containers and the AM 
> determines that the resource limits have been reached (e.g., virtual memory), 
> it starts killing *all* containers while reporting a *single* COMPLETED 
> status, essentially hanging the AM waiting for more containers to report 
> their state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1984) LeveldbTimelineStore does not handle db exceptions properly

2014-04-25 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-1984:
--

Issue Type: Sub-task  (was: Bug)
Parent: YARN-1530

> LeveldbTimelineStore does not handle db exceptions properly
> ---
>
> Key: YARN-1984
> URL: https://issues.apache.org/jira/browse/YARN-1984
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Jason Lowe
>
> The org.iq80.leveldb.DB and DBIterator methods throw runtime exceptions 
> rather than IOException which can easily leak up the stack and kill threads 
> (e.g.: the deletion thread).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1985) YARN issues wrong state when "running beyond virtual memory limits"

2014-04-25 Thread Oleg Zhurakousky (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981264#comment-13981264
 ] 

Oleg Zhurakousky commented on YARN-1985:


Just adding more info. For my 4-container app I get a single
{code}
- Received completed contaners callback: [ContainerStatus: [ContainerId: 
container_1398429077682_0005_01_03, State: COMPLETE, Diagnostics: Container 
[pid=11152,containerID=container_1398429077682_0005_01_03] is running 
beyond virtual memory limits. Current usage: 39.6 MB of 256 MB physical memory 
used; 1.8 GB of 537.6 MB virtual memory used. Killing container.
. . .
{code}
and then 4
{code}
State: COMPLETE,. . .
State: COMPLETE,. . .
State: COMPLETE,. . .
State: COMPLETE,. . .
{code}

> YARN issues wrong state when "running beyond virtual memory limits"
> ---
>
> Key: YARN-1985
> URL: https://issues.apache.org/jira/browse/YARN-1985
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>
> When deploying a YARN application with multiple containers and the AM 
> determines that the resource limits have been reached (e.g., virtual memory), 
> it starts killing *all* containers while reporting a *single* COMPLETED 
> status, essentially hanging the AM waiting for more containers to report 
> their state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1985) YARN issues wrong state when "running beyond virtual memory limits"

2014-04-25 Thread Oleg Zhurakousky (JIRA)
Oleg Zhurakousky created YARN-1985:
--

 Summary: YARN issues wrong state when "running beyond virtual 
memory limits"
 Key: YARN-1985
 URL: https://issues.apache.org/jira/browse/YARN-1985
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Oleg Zhurakousky


When deploying a YARN application with multiple containers and the AM determines 
that the resource limits have been reached (e.g., virtual memory), it starts 
killing *all* containers while reporting a *single* COMPLETED status, 
essentially hanging the AM waiting for more containers to report their state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1941) Yarn scheduler ACL improvement

2014-04-25 Thread Remus Rusanu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981190#comment-13981190
 ] 

Remus Rusanu commented on YARN-1941:


Keeping the root at the default ACL goes against the 'secure by default' 
principle. This is not an issue of consistency between root and other queues; 
the issue is the hierarchical check: leaf queues defer to their parent:
{code}
  public boolean hasAccess(QueueACL acl, UserGroupInformation user) {
// Check if the leaf-queue allows access
synchronized (this) {
  if (acls.get(acl).isUserAllowed(user)) {
return true;
  }
}
// Check if parent-queue allows access
return getParent().hasAccess(acl, user);
  }
{code}

and parents further defer to their parents:

{code}
  @Override
  public boolean hasAccess(QueueACL acl, UserGroupInformation user) {
synchronized (this) {
  if (acls.get(acl).isUserAllowed(user)) {
return true;
  }
}

if (parent != null) {
  return parent.hasAccess(acl, user);
}

return false;
  }
{code}

So ultimately the root ACLs cover every queue. With the default being '*', all 
queues are accessible by everyone. This is a fairly bad 'default' to have.
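
With the '*' default, restricting submissions requires setting root's ACL 
explicitly; a sketch of what that looks like in capacity-scheduler.xml (a 
single-space value means nobody):
{code}
<property>
  <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
  <!-- a single space means nobody; without this, the '*' default on root
       grants submit access to every queue via the hierarchical check -->
  <value> </value>
</property>
{code}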

> Yarn scheduler ACL improvement
> --
>
> Key: YARN-1941
> URL: https://issues.apache.org/jira/browse/YARN-1941
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.3.0
>Reporter: Gordon Wang
>Assignee: Gordon Wang
>  Labels: scheduler
>
> Defect:
> 1. Currently, in the YARN Capacity Scheduler and Fair Scheduler, the queue 
> ACL is always checked when submitting an app to the scheduler, regardless of 
> the property "yarn.acl.enable".
> But for killing an app, the ACL is checked only when yarn.acl.enable is set.
> The behaviour is not consistent.
> 2. The default ACL for the root queue is EVERYBODY_ACL( * ), while the 
> default ACL for other queues is NOBODY_ACL( ). From the users' view, this is 
> error prone and makes the ACL policy of the YARN scheduler hard to 
> understand; the root queue should not be so special compared with other 
> parent queues.
> For example, to set a Capacity Scheduler ACL, the ACL of root has to be set 
> explicitly; otherwise everyone can submit apps to the scheduler, because the 
> root queue ACL is EVERYBODY_ACL.
> This makes the scheduler hard for users to administer.
> So, I propose to improve the ACLs of the YARN scheduler in the following 
> aspects:
> 1. Only enable scheduler queue ACLs when yarn.acl.enable is set to true.
> 2. Set the default ACL of the root queue to NOBODY_ACL( ), making all the 
> parent queues' ACLs consistent.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1984) LeveldbTimelineStore does not handle db exceptions properly

2014-04-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981185#comment-13981185
 ] 

Jason Lowe commented on YARN-1984:
--

Ran across this while working with leveldb as part of MAPREDUCE-5652 and 
YARN-1336.  There are two DBExceptions, NativeDB.DBException and 
leveldb.DBException.  The former is derived from IOException and raised by the 
low-level JNI code, while the latter is derived from RuntimeException and is 
thrown by the JniDB wrapper code.  To make matters worse, DBIterator throws 
_raw_ RuntimeException rather than the runtime DBException from its methods, so 
database errors can leak up the stack even if code is expecting the runtime 
DBException.

The timeline store should handle the runtime exceptions and treat them like I/O 
errors, at least to keep them from tearing down the deletion thread (if not 
other cases).

We may want to create a wrapper utility class for DBIterator in YARN as a 
workaround so interacting with the database only requires handling of 
leveldb.DBException rather than also trying to wrestle with the raw 
RuntimeExceptions from the iterator.  See the DBIterator wrapper class in 
https://issues.apache.org/jira/secure/attachment/12641927/MAPREDUCE-5652-v8.patch
 as a rough example.
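
Roughly, the wrapper idea looks like the sketch below (an illustration of the 
shape, not the actual MAPREDUCE-5652 code):
{code}
import java.util.Map;
import org.iq80.leveldb.DBException;
import org.iq80.leveldb.DBIterator;

// Sketch: wrap DBIterator so callers only handle leveldb.DBException
// instead of the raw RuntimeExceptions the iterator can throw.
public class WrappedDBIterator {
  private final DBIterator iter;

  public WrappedDBIterator(DBIterator iter) {
    this.iter = iter;
  }

  public boolean hasNext() throws DBException {
    try {
      return iter.hasNext();
    } catch (DBException e) {
      throw e;                                   // already normalized
    } catch (RuntimeException e) {
      throw new DBException(e.getMessage(), e);  // normalize raw errors
    }
  }

  public Map.Entry<byte[], byte[]> next() throws DBException {
    try {
      return iter.next();
    } catch (DBException e) {
      throw e;
    } catch (RuntimeException e) {
      throw new DBException(e.getMessage(), e);
    }
  }

  public void close() throws java.io.IOException {
    iter.close();                                // DBIterator is Closeable
  }
}
{code}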

> LeveldbTimelineStore does not handle db exceptions properly
> ---
>
> Key: YARN-1984
> URL: https://issues.apache.org/jira/browse/YARN-1984
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Jason Lowe
>
> The org.iq80.leveldb.DB and DBIterator methods throw runtime exceptions 
> rather than IOException which can easily leak up the stack and kill threads 
> (e.g.: the deletion thread).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1984) LeveldbTimelineStore does not handle db exceptions properly

2014-04-25 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-1984:


 Summary: LeveldbTimelineStore does not handle db exceptions 
properly
 Key: YARN-1984
 URL: https://issues.apache.org/jira/browse/YARN-1984
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Jason Lowe


The org.iq80.leveldb.DB and DBIterator methods throw runtime exceptions rather 
than IOException which can easily leak up the stack and kill threads (e.g.: the 
deletion thread).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1975) Used resources shows escaped html in CapacityScheduler and FairScheduler page

2014-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981005#comment-13981005
 ] 

Hudson commented on YARN-1975:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1742 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1742/])
YARN-1975. Used resources shows escaped html in CapacityScheduler and 
FairScheduler page. Contributed by Mit Desai (jlowe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1589859)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/FairSchedulerPage.java


> Used resources shows escaped html in CapacityScheduler and FairScheduler page
> -
>
> Key: YARN-1975
> URL: https://issues.apache.org/jira/browse/YARN-1975
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Nathan Roberts
>Assignee: Mit Desai
> Fix For: 3.0.0, 2.4.1
>
> Attachments: YARN-1975.patch, screenshot-1975.png
>
>
> Used resources displays as <memory:, vCores;> with capacity 
> scheduler



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1975) Used resources shows escaped html in CapacityScheduler and FairScheduler page

2014-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980994#comment-13980994
 ] 

Hudson commented on YARN-1975:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1768 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1768/])
YARN-1975. Used resources shows escaped html in CapacityScheduler and 
FairScheduler page. Contributed by Mit Desai (jlowe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1589859)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/FairSchedulerPage.java


> Used resources shows escaped html in CapacityScheduler and FairScheduler page
> -
>
> Key: YARN-1975
> URL: https://issues.apache.org/jira/browse/YARN-1975
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Nathan Roberts
>Assignee: Mit Desai
> Fix For: 3.0.0, 2.4.1
>
> Attachments: YARN-1975.patch, screenshot-1975.png
>
>
> Used resources displays as <memory:, vCores;> with capacity 
> scheduler



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1975) Used resources shows escaped html in CapacityScheduler and FairScheduler page

2014-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980931#comment-13980931
 ] 

Hudson commented on YARN-1975:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #551 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/551/])
YARN-1975. Used resources shows escaped html in CapacityScheduler and 
FairScheduler page. Contributed by Mit Desai (jlowe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1589859)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/FairSchedulerPage.java


> Used resources shows escaped html in CapacityScheduler and FairScheduler page
> -
>
> Key: YARN-1975
> URL: https://issues.apache.org/jira/browse/YARN-1975
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Nathan Roberts
>Assignee: Mit Desai
> Fix For: 3.0.0, 2.4.1
>
> Attachments: YARN-1975.patch, screenshot-1975.png
>
>
> Used resources displays as <memory:, vCores;> with capacity 
> scheduler



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1972) Implement secure Windows Container Executor

2014-04-25 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated YARN-1972:
---

Labels: security windows  (was: )

> Implement secure Windows Container Executor
> ---
>
> Key: YARN-1972
> URL: https://issues.apache.org/jira/browse/YARN-1972
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>  Labels: security, windows
> Attachments: YARN-1972.1.patch
>
>
> This work item represents the Java-side changes required to implement a 
> secure Windows container executor, based on the YARN-1063 changes on the 
> native/winutils side. 
> Necessary changes include leveraging the winutils task createAsUser to launch 
> the container process as the required user, and a secure localizer (launching 
> localization as a separate process running as the container user).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1972) Implement secure Windows Container Executor

2014-04-25 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated YARN-1972:
---

Component/s: nodemanager

> Implement secure Windows Container Executor
> ---
>
> Key: YARN-1972
> URL: https://issues.apache.org/jira/browse/YARN-1972
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>  Labels: security, windows
> Attachments: YARN-1972.1.patch
>
>
> This work item represents the Java-side changes required to implement a 
> secure Windows container executor, based on the YARN-1063 changes on the 
> native/winutils side. 
> Necessary changes include leveraging the winutils task createAsUser to launch 
> the container process as the required user, and a secure localizer (launching 
> localization as a separate process running as the container user).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1972) Implement secure Windows Container Executor

2014-04-25 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated YARN-1972:
---

Attachment: YARN-1972.1.patch

Iteration 1. I will upload a short design soon to make the dry code a more 
palatable read.

> Implement secure Windows Container Executor
> ---
>
> Key: YARN-1972
> URL: https://issues.apache.org/jira/browse/YARN-1972
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
> Attachments: YARN-1972.1.patch
>
>
> This work item represents the Java-side changes required to implement a 
> secure Windows container executor, based on the YARN-1063 changes on the 
> native/winutils side. 
> Necessary changes include leveraging the winutils task createAsUser to launch 
> the container process as the required user, and a secure localizer (launching 
> localization as a separate process running as the container user).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1063) Winutils needs ability to create task as domain user

2014-04-25 Thread Remus Rusanu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980754#comment-13980754
 ] 

Remus Rusanu commented on YARN-1063:


The patch applies fine on trunk for me. Not sure why it failed for Mr. Jenkins. 
I removed the trunk-win tags.

> Winutils needs ability to create task as domain user
> 
>
> Key: YARN-1063
> URL: https://issues.apache.org/jira/browse/YARN-1063
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
> Environment: Windows
>Reporter: Kyle Leckie
>Assignee: Remus Rusanu
>  Labels: security, windows
> Attachments: YARN-1063.2.patch, YARN-1063.patch
>
>
> h1. Summary:
> Securing a Hadoop cluster requires constructing some form of security 
> boundary around the processes executed in YARN containers. Isolation based on 
> Windows user isolation seems most feasible. This approach is similar to the 
> approach taken by the existing LinuxContainerExecutor. The current patch to 
> winutils.exe adds the ability to create a process as a domain user. 
> h1. Alternative Methods considered:
> h2. Process rights limited by security token restriction:
> On Windows access decisions are made by examining the security token of a 
> process. It is possible to spawn a process with a restricted security token. 
> Any of the rights granted by SIDs of the default token may be restricted. It 
> is possible to see this in action by examining the security token of a 
> sandboxed process launched by a web browser. Typically the launched process 
> will have a fully restricted token and needs to access machine resources 
> through a dedicated broker process that enforces a custom security policy. 
> This broker process mechanism would break compatibility with the typical 
> Hadoop container process. The container process must be able to utilize 
> standard function calls for disk and network IO. I performed some work 
> looking at ways to ACL the local files to the specific launched process 
> without granting rights to other processes launched on the same machine, but 
> found this to be an overly complex solution. 
> h2. Relying on APP containers:
> Recent versions of Windows have the ability to launch processes within an 
> isolated container. Application containers are supported for execution of 
> WinRT based executables. This method was ruled out due to the lack of 
> official support for standard Windows APIs. At some point in the future 
> Windows may support functionality similar to BSD jails or Linux containers; 
> at that point support for containers should be added.
> h1. Create As User Feature Description:
> h2. Usage:
> A new sub command was added to the set of task commands. Here is the syntax:
> winutils task createAsUser [TASKNAME] [USERNAME] [COMMAND_LINE]
> Some notes:
> * The username specified is in the format of "user@domain"
> * The machine executing this command must be joined to the domain of the user 
> specified
> * The domain controller must allow the account executing the command access 
> to the user information. For this join the account to the predefined group 
> labeled "Pre-Windows 2000 Compatible Access"
> * The account running the command must have several rights on the local 
> machine. These can be managed manually using secpol.msc: 
> ** "Act as part of the operating system" - SE_TCB_NAME
> ** "Replace a process-level token" - SE_ASSIGNPRIMARYTOKEN_NAME
> ** "Adjust memory quotas for a process" - SE_INCREASE_QUOTA_NAME
> * The launched process will not have rights to the desktop so will not be 
> able to display any information or create UI.
> * The launched process will have no network credentials. Any access of 
> network resources that requires domain authentication will fail.
> h2. Implementation:
> Winutils performs the following steps:
> # Enable the required privileges for the current process.
> # Register as a trusted process with the Local Security Authority (LSA).
> # Create a new logon for the user passed on the command line.
> # Load/Create a profile on the local machine for the new logon.
> # Create a new environment for the new logon.
> # Launch the new process in a job with the task name specified and using the 
> created logon.
> # Wait for the JOB to exit.
> h2. Future work:
> The following work was scoped out of this check in:
> * Support for non-domain users or machines that are not domain joined.
> * Support for privilege isolation by running the task launcher in a high 
> privilege service with access over an ACLed named pipe.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1063) Winutils needs ability to create task as domain user

2014-04-25 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated YARN-1063:
---

Affects Version/s: (was: trunk-win)

> Winutils needs ability to create task as domain user
> 
>
> Key: YARN-1063
> URL: https://issues.apache.org/jira/browse/YARN-1063
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
> Environment: Windows
>Reporter: Kyle Leckie
>  Labels: security, windows
> Attachments: YARN-1063.2.patch, YARN-1063.patch
>
>
> h1. Summary:
> Securing a Hadoop cluster requires constructing some form of security 
> boundary around the processes executed in YARN containers. Isolation based on 
> Windows user isolation seems most feasible. This approach is similar to the 
> approach taken by the existing LinuxContainerExecutor. The current patch to 
> winutils.exe adds the ability to create a process as a domain user. 
> h1. Alternative Methods considered:
> h2. Process rights limited by security token restriction:
> On Windows access decisions are made by examining the security token of a 
> process. It is possible to spawn a process with a restricted security token. 
> Any of the rights granted by SIDs of the default token may be restricted. It 
> is possible to see this in action by examining the security token of a 
> sandboxed process launched by a web browser. Typically the launched process 
> will have a fully restricted token and needs to access machine resources 
> through a dedicated broker process that enforces a custom security policy. 
> This broker process mechanism would break compatibility with the typical 
> Hadoop container process. The container process must be able to utilize 
> standard function calls for disk and network IO. I performed some work 
> looking at ways to ACL the local files to the specific launched process 
> without granting rights to other processes launched on the same machine, but 
> found this to be an overly complex solution. 
> h2. Relying on APP containers:
> Recent versions of Windows have the ability to launch processes within an 
> isolated container. Application containers are supported for execution of 
> WinRT based executables. This method was ruled out due to the lack of 
> official support for standard Windows APIs. At some point in the future 
> Windows may support functionality similar to BSD jails or Linux containers; 
> at that point support for containers should be added.
> h1. Create As User Feature Description:
> h2. Usage:
> A new sub command was added to the set of task commands. Here is the syntax:
> winutils task createAsUser [TASKNAME] [USERNAME] [COMMAND_LINE]
> Some notes:
> * The username specified is in the format of "user@domain"
> * The machine executing this command must be joined to the domain of the user 
> specified
> * The domain controller must allow the account executing the command access 
> to the user information. For this join the account to the predefined group 
> labeled "Pre-Windows 2000 Compatible Access"
> * The account running the command must have several rights on the local 
> machine. These can be managed manually using secpol.msc: 
> ** "Act as part of the operating system" - SE_TCB_NAME
> ** "Replace a process-level token" - SE_ASSIGNPRIMARYTOKEN_NAME
> ** "Adjust memory quotas for a process" - SE_INCREASE_QUOTA_NAME
> * The launched process will not have rights to the desktop so will not be 
> able to display any information or create UI.
> * The launched process will have no network credentials. Any access of 
> network resources that requires domain authentication will fail.
> h2. Implementation:
> Winutils performs the following steps:
> # Enable the required privileges for the current process.
> # Register as a trusted process with the Local Security Authority (LSA).
> # Create a new logon for the user passed on the command line.
> # Load/Create a profile on the local machine for the new logon.
> # Create a new environment for the new logon.
> # Launch the new process in a job with the task name specified and using the 
> created logon.
> # Wait for the JOB to exit.
> h2. Future work:
> The following work was scoped out of this check in:
> * Support for non-domain users or machines that are not domain joined.
> * Support for privilege isolation by running the task launcher in a high 
> privilege service with access over an ACLed named pipe.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (YARN-1063) Winutils needs ability to create task as domain user

2014-04-25 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu reassigned YARN-1063:
--

Assignee: Remus Rusanu

> Winutils needs ability to create task as domain user
> 
>
> Key: YARN-1063
> URL: https://issues.apache.org/jira/browse/YARN-1063
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
> Environment: Windows
>Reporter: Kyle Leckie
>Assignee: Remus Rusanu
>  Labels: security, windows
> Attachments: YARN-1063.2.patch, YARN-1063.patch
>
>
> h1. Summary:
> Securing a Hadoop cluster requires constructing some form of security 
> boundary around the processes executed in YARN containers. Isolation based on 
> Windows user isolation seems most feasible. This approach is similar to the 
> approach taken by the existing LinuxContainerExecutor. The current patch to 
> winutils.exe adds the ability to create a process as a domain user. 
> h1. Alternative Methods considered:
> h2. Process rights limited by security token restriction:
> On Windows access decisions are made by examining the security token of a 
> process. It is possible to spawn a process with a restricted security token. 
> Any of the rights granted by SIDs of the default token may be restricted. It 
> is possible to see this in action by examining the security token of a 
> sandboxed process launched by a web browser. Typically the launched process 
> will have a fully restricted token and needs to access machine resources 
> through a dedicated broker process that enforces a custom security policy. 
> This broker process mechanism would break compatibility with the typical 
> Hadoop container process. The container process must be able to utilize 
> standard function calls for disk and network IO. I performed some work 
> looking at ways to ACL the local files to the specific launched process 
> without granting rights to other processes launched on the same machine, but 
> found this to be an overly complex solution. 
> h2. Relying on APP containers:
> Recent versions of Windows have the ability to launch processes within an 
> isolated container. Application containers are supported for execution of 
> WinRT based executables. This method was ruled out due to the lack of 
> official support for standard Windows APIs. At some point in the future 
> Windows may support functionality similar to BSD jails or Linux containers; 
> at that point support for containers should be added.
> h1. Create As User Feature Description:
> h2. Usage:
> A new sub command was added to the set of task commands. Here is the syntax:
> winutils task createAsUser [TASKNAME] [USERNAME] [COMMAND_LINE]
> Some notes:
> * The username specified is in the format of "user@domain"
> * The machine executing this command must be joined to the domain of the user 
> specified
> * The domain controller must allow the account executing the command access 
> to the user information. For this join the account to the predefined group 
> labeled "Pre-Windows 2000 Compatible Access"
> * The account running the command must have several rights on the local 
> machine. These can be managed manually using secpol.msc: 
> ** "Act as part of the operating system" - SE_TCB_NAME
> ** "Replace a process-level token" - SE_ASSIGNPRIMARYTOKEN_NAME
> ** "Adjust memory quotas for a process" - SE_INCREASE_QUOTA_NAME
> * The launched process will not have rights to the desktop so will not be 
> able to display any information or create UI.
> * The launched process will have no network credentials. Any access of 
> network resources that requires domain authentication will fail.
> h2. Implementation:
> Winutils performs the following steps:
> # Enable the required privileges for the current process.
> # Register as a trusted process with the Local Security Authority (LSA).
> # Create a new logon for the user passed on the command line.
> # Load/Create a profile on the local machine for the new logon.
> # Create a new environment for the new logon.
> # Launch the new process in a job with the task name specified and using the 
> created logon.
> # Wait for the JOB to exit.
> h2. Future work:
> The following work was scoped out of this check in:
> * Support for non-domain users or machines that are not domain joined.
> * Support for privilege isolation by running the task launcher in a high 
> privilege service with access over an ACLed named pipe.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1063) Winutils needs ability to create task as domain user

2014-04-25 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated YARN-1063:
---

Labels: security windows  (was: security)

> Winutils needs ability to create task as domain user
> 
>
> Key: YARN-1063
> URL: https://issues.apache.org/jira/browse/YARN-1063
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
> Environment: Windows
>Reporter: Kyle Leckie
>Assignee: Remus Rusanu
>  Labels: security, windows
> Attachments: YARN-1063.2.patch, YARN-1063.patch
>
>
> h1. Summary:
> Securing a Hadoop cluster requires constructing some form of security 
> boundary around the processes executed in YARN containers. Isolation based on 
> Windows user isolation seems most feasible. This approach is similar to the 
> approach taken by the existing LinuxContainerExecutor. The current patch to 
> winutils.exe adds the ability to create a process as a domain user. 
> h1. Alternative Methods considered:
> h2. Process rights limited by security token restriction:
> On Windows access decisions are made by examining the security token of a 
> process. It is possible to spawn a process with a restricted security token. 
> Any of the rights granted by SIDs of the default token may be restricted. It 
> is possible to see this in action by examining the security token of a 
> sandboxed process launched by a web browser. Typically the launched process 
> will have a fully restricted token and needs to access machine resources 
> through a dedicated broker process that enforces a custom security policy. 
> This broker process mechanism would break compatibility with the typical 
> Hadoop container process. The container process must be able to utilize 
> standard function calls for disk and network IO. I performed some work 
> looking at ways to ACL the local files to the specific launched process 
> without granting rights to other processes launched on the same machine, but 
> found this to be an overly complex solution. 
> h2. Relying on APP containers:
> Recent versions of Windows have the ability to launch processes within an 
> isolated container. Application containers are supported for execution of 
> WinRT based executables. This method was ruled out due to the lack of 
> official support for standard Windows APIs. At some point in the future 
> Windows may support functionality similar to BSD jails or Linux containers; 
> at that point support for containers should be added.
> h1. Create As User Feature Description:
> h2. Usage:
> A new sub command was added to the set of task commands. Here is the syntax:
> winutils task createAsUser [TASKNAME] [USERNAME] [COMMAND_LINE]
> Some notes:
> * The username specified is in the format of "user@domain"
> * The machine executing this command must be joined to the domain of the user 
> specified
> * The domain controller must allow the account executing the command access 
> to the user information. For this join the account to the predefined group 
> labeled "Pre-Windows 2000 Compatible Access"
> * The account running the command must have several rights on the local 
> machine. These can be managed manually using secpol.msc: 
> ** "Act as part of the operating system" - SE_TCB_NAME
> ** "Replace a process-level token" - SE_ASSIGNPRIMARYTOKEN_NAME
> ** "Adjust memory quotas for a process" - SE_INCREASE_QUOTA_NAME
> * The launched process will not have rights to the desktop so will not be 
> able to display any information or create UI.
> * The launched process will have no network credentials. Any access of 
> network resources that requires domain authentication will fail.
> h2. Implementation:
> Winutils performs the following steps:
> # Enable the required privileges for the current process.
> # Register as a trusted process with the Local Security Authority (LSA).
> # Create a new logon for the user passed on the command line.
> # Load/Create a profile on the local machine for the new logon.
> # Create a new environment for the new logon.
> # Launch the new process in a job with the task name specified and using the 
> created logon.
> # Wait for the JOB to exit.
> h2. Future work:
> The following work was scoped out of this check in:
> * Support for non-domain users or machines that are not domain joined.
> * Support for privilege isolation by running the task launcher in a high 
> privilege service with access over an ACLed named pipe.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1063) Winutils needs ability to create task as domain user

2014-04-25 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated YARN-1063:
---

Target Version/s:   (was: trunk-win)

> Winutils needs ability to create task as domain user
> 
>
> Key: YARN-1063
> URL: https://issues.apache.org/jira/browse/YARN-1063
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
> Environment: Windows
>Reporter: Kyle Leckie
>  Labels: security, windows
> Attachments: YARN-1063.2.patch, YARN-1063.patch
>
>
> h1. Summary:
> Securing a Hadoop cluster requires constructing some form of security 
> boundary around the processes executed in YARN containers. Isolation based on 
> Windows user isolation seems most feasible. This approach is similar to the 
> approach taken by the existing LinuxContainerExecutor. The current patch to 
> winutils.exe adds the ability to create a process as a domain user. 
> h1. Alternative Methods considered:
> h2. Process rights limited by security token restriction:
> On Windows access decisions are made by examining the security token of a 
> process. It is possible to spawn a process with a restricted security token. 
> Any of the rights granted by SIDs of the default token may be restricted. It 
> is possible to see this in action by examining the security token of a 
> sandboxed process launched by a web browser. Typically the launched process 
> will have a fully restricted token and needs to access machine resources 
> through a dedicated broker process that enforces a custom security policy. 
> This broker process mechanism would break compatibility with the typical 
> Hadoop container process. The container process must be able to utilize 
> standard function calls for disk and network IO. I performed some work 
> looking at ways to ACL the local files to the specific launched process 
> without granting rights to other processes launched on the same machine, but 
> found this to be an overly complex solution. 
> h2. Relying on APP containers:
> Recent versions of Windows have the ability to launch processes within an 
> isolated container. Application containers are supported for execution of 
> WinRT based executables. This method was ruled out due to the lack of 
> official support for standard Windows APIs. At some point in the future 
> Windows may support functionality similar to BSD jails or Linux containers; 
> at that point support for containers should be added.
> h1. Create As User Feature Description:
> h2. Usage:
> A new sub command was added to the set of task commands. Here is the syntax:
> winutils task createAsUser [TASKNAME] [USERNAME] [COMMAND_LINE]
> Some notes:
> * The username specified is in the format of "user@domain"
> * The machine executing this command must be joined to the domain of the user 
> specified
> * The domain controller must allow the account executing the command access 
> to the user information. For this join the account to the predefined group 
> labeled "Pre-Windows 2000 Compatible Access"
> * The account running the command must have several rights on the local 
> machine. These can be managed manually using secpol.msc: 
> ** "Act as part of the operating system" - SE_TCB_NAME
> ** "Replace a process-level token" - SE_ASSIGNPRIMARYTOKEN_NAME
> ** "Adjust memory quotas for a process" - SE_INCREASE_QUOTA_NAME
> * The launched process will not have rights to the desktop so will not be 
> able to display any information or create UI.
> * The launched process will have no network credentials. Any access of 
> network resources that requires domain authentication will fail.
> h2. Implementation:
> Winutils performs the following steps:
> # Enable the required privileges for the current process.
> # Register as a trusted process with the Local Security Authority (LSA).
> # Create a new logon for the user passed on the command line.
> # Load/Create a profile on the local machine for the new logon.
> # Create a new environment for the new logon.
> # Launch the new process in a job with the task name specified and using the 
> created logon.
> # Wait for the JOB to exit.
> h2. Future work:
> The following work was scoped out of this check in:
> * Support for non-domain users or machines that are not domain joined.
> * Support for privilege isolation by running the task launcher in a high 
> privilege service with access over an ACLed named pipe.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1063) Winutils needs ability to create task as domain user

2014-04-25 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated YARN-1063:
---

Fix Version/s: (was: trunk-win)

> Winutils needs ability to create task as domain user
> 
>
> Key: YARN-1063
> URL: https://issues.apache.org/jira/browse/YARN-1063
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
> Environment: Windows
>Reporter: Kyle Leckie
>  Labels: security, windows
> Attachments: YARN-1063.2.patch, YARN-1063.patch
>
>
> h1. Summary:
> Securing a Hadoop cluster requires constructing some form of security 
> boundary around the processes executed in YARN containers. Isolation based on 
> Windows user isolation seems most feasible. This approach is similar to the 
> approach taken by the existing LinuxContainerExecutor. The current patch to 
> winutils.exe adds the ability to create a process as a domain user. 
> h1. Alternative Methods considered:
> h2. Process rights limited by security token restriction:
> On Windows access decisions are made by examining the security token of a 
> process. It is possible to spawn a process with a restricted security token. 
> Any of the rights granted by SIDs of the default token may be restricted. It 
> is possible to see this in action by examining the security token of a 
> sandboxed process launched by a web browser. Typically the launched process 
> will have a fully restricted token and needs to access machine resources 
> through a dedicated broker process that enforces a custom security policy. 
> This broker process mechanism would break compatibility with the typical 
> Hadoop container process. The container process must be able to utilize 
> standard function calls for disk and network IO. I performed some work 
> looking at ways to ACL the local files to the specific launched process 
> without granting rights to other processes launched on the same machine, but 
> found this to be an overly complex solution. 
> h2. Relying on APP containers:
> Recent versions of Windows have the ability to launch processes within an 
> isolated container. Application containers are supported for execution of 
> WinRT based executables. This method was ruled out due to the lack of 
> official support for standard Windows APIs. At some point in the future 
> Windows may support functionality similar to BSD jails or Linux containers; 
> at that point support for containers should be added.
> h1. Create As User Feature Description:
> h2. Usage:
> A new sub command was added to the set of task commands. Here is the syntax:
> winutils task createAsUser [TASKNAME] [USERNAME] [COMMAND_LINE]
> Some notes:
> * The username specified is in the format of "user@domain"
> * The machine executing this command must be joined to the domain of the user 
> specified
> * The domain controller must allow the account executing the command access 
> to the user information. For this join the account to the predefined group 
> labeled "Pre-Windows 2000 Compatible Access"
> * The account running the command must have several rights on the local 
> machine. These can be managed manually using secpol.msc: 
> ** "Act as part of the operating system" - SE_TCB_NAME
> ** "Replace a process-level token" - SE_ASSIGNPRIMARYTOKEN_NAME
> ** "Adjust memory quotas for a process" - SE_INCREASE_QUOTA_NAME
> * The launched process will not have rights to the desktop so will not be 
> able to display any information or create UI.
> * The launched process will have no network credentials. Any access of 
> network resources that requires domain authentication will fail.
> h2. Implementation:
> Winutils performs the following steps:
> # Enable the required privileges for the current process.
> # Register as a trusted process with the Local Security Authority (LSA).
> # Create a new logon for the user passed on the command line.
> # Load/Create a profile on the local machine for the new logon.
> # Create a new environment for the new logon.
> # Launch the new process in a job with the task name specified and using the 
> created logon.
> # Wait for the JOB to exit.
> h2. Future work:
> The following work was scoped out of this check in:
> * Support for non-domain users or machines that are not domain joined.
> * Support for privilege isolation by running the task launcher in a high 
> privilege service with access over an ACLed named pipe.



--
This message was sent by Atlassian JIRA
(v6.2#6252)