[jira] [Commented] (YARN-1917) Add waitForApplicationState interface to YarnClient

2014-04-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975428#comment-13975428
 ] 

Hadoop QA commented on YARN-1917:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12641019/YARN-1917.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3599//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3599//console

This message is automatically generated.

 Add waitForApplicationState interface to YarnClient
 -

 Key: YARN-1917
 URL: https://issues.apache.org/jira/browse/YARN-1917
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: client
Affects Versions: 2.4.0
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-1917.patch, YARN-1917.patch, YARN-1917.patch


 Currently, YARN doesn't have this method. Users need to write 
 implementations like UnmanagedAMLauncher.monitorApplication or 
 mapreduce.Job.monitorAndPrintJob on their own. This feature should be helpful 
 to end users.
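
As a rough illustration of the kind of loop users currently write by hand, here is a minimal polling sketch. It is not the attached patch: AppState, AppStateWaiter, and waitForState are hypothetical names, and the Supplier stands in for whatever call reports the application's current state.

```java
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical stand-in for YARN's application states; not the real enum.
enum AppState { NEW, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

final class AppStateWaiter {
    private AppStateWaiter() {}

    /**
     * Polls stateSource until it reports one of the target states or the
     * timeout elapses. Returns the last observed state, so the caller can
     * tell a match from a timeout. If the thread is interrupted, the
     * interrupt flag is restored and the last observed state is returned.
     */
    static AppState waitForState(Supplier<AppState> stateSource,
                                 Set<AppState> targets,
                                 long pollIntervalMs,
                                 long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        AppState state = stateSource.get();
        while (!targets.contains(state) && System.nanoTime() < deadline) {
            try {
                Thread.sleep(pollIntervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return state;
            }
            state = stateSource.get();
        }
        return state;
    }
}
```

Returning the last observed state rather than a boolean is only one possible design; it lets the caller distinguish FINISHED from FAILED or KILLED without a second query.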



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1962) Timeline server is enabled by default

2014-04-21 Thread Mohammad Kamrul Islam (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Kamrul Islam updated YARN-1962:


Attachment: YARN-1962.1.patch

Thanks [~vinodkv].

Patch added

 Timeline server is enabled by default
 -

 Key: YARN-1962
 URL: https://issues.apache.org/jira/browse/YARN-1962
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.4.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
 Attachments: YARN-1962.1.patch


 Since the Timeline server is not yet mature or secured, enabling it by default 
 might create some confusion.
 We were playing with 2.4.0 and found a lot of exceptions in the distributed 
 shell example related to connection refused errors. Btw, we didn't run TS 
 because it is not secured yet.
 Although it is possible to explicitly turn it off through the yarn-site config, 
 in my opinion this extra change for this new service is not worthwhile at this 
 point.
 This JIRA is to turn it off by default.
 If there is agreement, I can put up a simple patch for this.
 {noformat}
 14/04/17 23:24:33 ERROR impl.TimelineClientImpl: Failed to get the response from the timeline server.
 com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: Connection refused
   at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
   at com.sun.jersey.api.client.Client.handle(Client.java:648)
   at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
   at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
   at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
   at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
   at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
   at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
 Caused by: java.net.ConnectException: Connection refused
   at java.net.PlainSocketImpl.socketConnect(Native Method)
   at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
   at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
   at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
   at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
   at java.net.Socket.connect(Socket.java:579)
   at java.net.Socket.connect(Socket.java:528)
   at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
   at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
   at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
   at sun.net.www.http.HttpClient.in
 14/04/17 23:24:33 ERROR impl.TimelineClientImpl: Failed to get the response from the timeline server.
 com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: Connection refused
   at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
   at com.sun.jersey.api.client.Client.handle(Client.java:648)
   at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
   at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
   at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
   at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
   at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
   at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
 Caused by: java.net.ConnectException: Connection refused
   at java.net.PlainSocketImpl.socketConnect(Native Method)
   at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
   at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
   at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
   at

[jira] [Commented] (YARN-1879) Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol

2014-04-21 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975499#comment-13975499
 ] 

Tsuyoshi OZAWA commented on YARN-1879:
--

[~xgong] and [~jianhe], do you have any additional comments or opinions? Let me 
know if anything is unclear.

 Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol
 ---

 Key: YARN-1879
 URL: https://issues.apache.org/jira/browse/YARN-1879
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Tsuyoshi OZAWA
Priority: Critical
 Attachments: YARN-1879.1.patch, YARN-1879.1.patch, 
 YARN-1879.2-wip.patch, YARN-1879.2.patch, YARN-1879.3.patch, 
 YARN-1879.4.patch, YARN-1879.5.patch








[jira] [Commented] (YARN-1962) Timeline server is enabled by default

2014-04-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975504#comment-13975504
 ] 

Hadoop QA commented on YARN-1962:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12641040/YARN-1962.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3600//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3600//console

This message is automatically generated.


[jira] [Updated] (YARN-1966) Capacity Scheduler acl_submit_applications in Leaf Queue finally considers root queue default always

2014-04-21 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-1966:
--

Attachment: Yarn-1966.1.patch

The issue occurs because, when an application is submitted, the hasAccess() 
check always reaches ParentQueue::hasAccess(). 
The parent eventually resolves to root, which passes this check 
(* is the default for acl_submit_applications and acl_administer_queue in the 
root queue).

So, to ensure that only one specified user can submit jobs to a leaf queue, the 
configurations below are mandatory in root:
  root.acl_submit_applications 
  root.acl_administer_queue

The acl_administer_queue check has no relevance to submitting a job, but we are 
forced to configure it as well in order to achieve what is described in the 
problem statement of this issue. 
Also, if each leaf queue wants to have its own set of users, all of those users 
ultimately have to be listed in root. This is not good.

So it is better to skip the hasAccess() check if the parent queue is root, as below:
if(rootQueue){
   return false;
}

 Capacity Scheduler acl_submit_applications in Leaf Queue finally considers 
 root queue default always
 

 Key: YARN-1966
 URL: https://issues.apache.org/jira/browse/YARN-1966
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Sunil G
 Attachments: Yarn-1966.1.patch


 Given the configurations below,
 <property>
   <name>yarn.scheduler.capacity.root.queues</name>
   <value>fast,medium</value>
 </property>
 <property>
   <name>yarn.scheduler.capacity.root.fast.acl_submit_applications</name>
   <value>hadoop</value>
 </property>
 <property>
   <name>yarn.scheduler.capacity.root.slow.acl_submit_applications</name>
   <value>hadoop</value>
 </property>
 In this case, the expectation is that only the hadoop user can submit jobs to 
 the fast or slow queue.
 But currently any user can submit jobs to these queues.





[jira] [Commented] (YARN-556) RM Restart phase 2 - Work preserving restart

2014-04-21 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975541#comment-13975541
 ] 

Tsuyoshi OZAWA commented on YARN-556:
-

[~adhoot], I glanced over your patch.

1. Can you split your code by subtask? Your patch includes the overall 
changes for this task. We should discuss the finer points on each subtask JIRA.
2. IMO, a prototype is enough to validate the design. Do you have any additional 
comments about the design docs?

I'd like to include this feature in 2.5.0 (maybe May - June?), so let's work 
together :-)

 RM Restart phase 2 - Work preserving restart
 

 Key: YARN-556
 URL: https://issues.apache.org/jira/browse/YARN-556
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: Work Preserving RM Restart.pdf, 
 WorkPreservingRestartPrototype.001.patch


 YARN-128 covered storing the state needed for the RM to recover critical 
 information. This umbrella jira will track changes needed to recover the 
 running state of the cluster so that work can be preserved across RM restarts.





[jira] [Resolved] (YARN-1657) Timeout occurs in TestNMClient

2014-04-21 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA resolved YARN-1657.
-

Resolution: Cannot Reproduce

 Timeout occurs in TestNMClient
 --

 Key: YARN-1657
 URL: https://issues.apache.org/jira/browse/YARN-1657
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 3.0.0
Reporter: Akira AJISAKA

 A timeout occurs in TestNMClient when a patch is tested by Jenkins.
 The following comment can be seen in YARN-1480, YARN-1611, and YARN-888.
 {code}
 {color:red}-1 core tests{color}.  The following test timeouts occurred in 
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client:
 org.apache.hadoop.yarn.client.api.impl.TestNMClient
 {code}





[jira] [Created] (YARN-1967) onShutdownRequest() is not called when AMRMClientAsyncImpl#unregisterApplicationMaster() is called

2014-04-21 Thread Tsuyoshi OZAWA (JIRA)
Tsuyoshi OZAWA created YARN-1967:


 Summary: onShutdownRequest() is not called when 
AMRMClientAsyncImpl#unregisterApplicationMaster() is called
 Key: YARN-1967
 URL: https://issues.apache.org/jira/browse/YARN-1967
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Tsuyoshi OZAWA


While working on YARN-1954, I found that onShutdownRequest() is not called when 
AMRMClientAsyncImpl#unregisterApplicationMaster() is called.
Should we fix it by calling onShutdownRequest() or by adding a new 
hook (onUnregisteredApplicationRequest) to CallbackHandler?





[jira] [Resolved] (YARN-1966) Capacity Scheduler acl_submit_applications in Leaf Queue finally considers root queue default always

2014-04-21 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved YARN-1966.
--

Resolution: Duplicate

This is a duplicate of YARN-1269 and related to YARN-1941 and YARN-1951.

In any case I don't think we want to special-case the root queue, as the same 
issue could exist in a subtree where access to the subtree root allows access 
to any queue within the subtree.

Actually I believe this is by design.  It allows admins to configure access to 
an entire subtree of queues by giving access to the root of the subtree rather 
than having to add the access to each leaf queue.  So for your example above 
you'll want to set the root queue's ACLs to be empty so that one must have 
access to the leaf queue in order to submit.
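
The inheritance behaviour described above can be illustrated with a small model. This is a sketch only, not the CapacityScheduler code: QueueNode and its fields are hypothetical, and the ACL is reduced to a set of user names in which "*" means anyone and an empty set means no one.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a queue tree; not the CapacityScheduler classes.
final class QueueNode {
    private final String name;
    private final QueueNode parent;      // null for the root queue
    private final Set<String> submitAcl; // "*" means anyone; empty means no one

    QueueNode(String name, QueueNode parent, String... submitAcl) {
        this.name = name;
        this.parent = parent;
        this.submitAcl = new HashSet<>(Arrays.asList(submitAcl));
    }

    String name() {
        return name;
    }

    /** A user may submit if this queue or any ancestor grants access. */
    boolean hasAccess(String user) {
        if (submitAcl.contains("*") || submitAcl.contains(user)) {
            return true;
        }
        return parent != null && parent.hasAccess(user);
    }
}
```

With a root ACL of "*", every user passes through the parent check regardless of the leaf ACL; with an empty root ACL, only users listed on the leaf (or on an intermediate parent) are admitted.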



[jira] [Commented] (YARN-1964) Support Docker containers in YARN

2014-04-21 Thread jay vyas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975599#comment-13975599
 ] 

jay vyas commented on YARN-1964:


This takes us away from the Java idiom of packaging apps as jars. It's a 
pretty bold step, so let me play devil's advocate, because I think it might not 
be the best idea.

1) Tying YARN's JVM NodeManagers to LCEs adds a new dependency to the YARN 
stack, which adds new building/compiling/maintaining costs.

2) NodeManagers are being run in Linux containers quite commonly. This could 
lead to containers launching NMs, which again launch containers. Seems 
kind of yucky, don't you think?

3) It forces YARN to be aware of Linux containers. This might encourage 
the creation of application code that doesn't run easily outside of a 
containerized environment.  As awesome as LCEs are, most java and bigdata apps 
running in YARN with the hadoop .

 Support Docker containers in YARN
 -

 Key: YARN-1964
 URL: https://issues.apache.org/jira/browse/YARN-1964
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Arun C Murthy
Assignee: Arun C Murthy

 Docker (https://www.docker.io/) is, increasingly, a very popular container 
 technology.
 In context of YARN, the support for Docker will provide a very elegant 
 solution to allow applications to *package* their software into a Docker 
 container (entire Linux file system incl. custom versions of perl, python 
 etc.) and use it as a blueprint to launch all their YARN containers with 
 requisite software environment. This provides both consistency (all YARN 
 containers will have the same software environment) and isolation (no 
 interference with whatever is installed on the physical machine).





[jira] [Commented] (YARN-1964) Support Docker containers in YARN

2014-04-21 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975610#comment-13975610
 ] 

Tsuyoshi OZAWA commented on YARN-1964:
--

Do you mean we will have DockerContainerExecutor instead of 
LinuxContainerExecutor?



[jira] [Commented] (YARN-1954) Add waitFor to AMRMClient(Async)

2014-04-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975687#comment-13975687
 ] 

Hadoop QA commented on YARN-1954:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12641071/YARN-1954.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3601//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3601//console

This message is automatically generated.

 Add waitFor to AMRMClient(Async)
 

 Key: YARN-1954
 URL: https://issues.apache.org/jira/browse/YARN-1954
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: client
Affects Versions: 3.0.0, 2.4.0
Reporter: Zhijie Shen
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-1954.1.patch, YARN-1954.2.patch, YARN-1954.3.patch


 Recently, I saw some use cases of AMRMClient(Async). The painful thing is 
 that the main non-daemon thread has to sit in a dummy loop to prevent the AM 
 process from exiting before all the tasks are done, while unregistration is 
 triggered on a separate daemon thread by callback methods (in 
 particular when using AMRMClientAsync). IMHO, it would be beneficial to add 
 a waitFor method to AMRMClient(Async) to block the AM until unregistration or 
 a user-supplied checkpoint, so that users don't need to write the loop 
 themselves.
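
A minimal sketch of what such a blocking helper might look like follows. It is an assumption for illustration, not the API in the attached patches: WaitForSketch, waitFor, and demo are hypothetical names, and the check is modelled as a java.util.function.BooleanSupplier.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

final class WaitForSketch {
    private WaitForSketch() {}

    /**
     * Blocks the calling thread until check returns true, re-evaluating it
     * every intervalMs milliseconds. Returns false if the thread is
     * interrupted before the check passes.
     */
    static boolean waitFor(BooleanSupplier check, long intervalMs) {
        while (!check.getAsBoolean()) {
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    /**
     * Usage: a daemon thread flips the flag, standing in for a callback
     * that completes unregistration while the main thread waits.
     */
    static boolean demo() {
        AtomicBoolean done = new AtomicBoolean(false);
        Thread callback = new Thread(() -> done.set(true));
        callback.setDaemon(true);
        callback.start();
        return waitFor(done::get, 10L);
    }
}
```

The main thread would call waitFor once instead of maintaining its own loop; the condition itself stays under the user's control.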





[jira] [Created] (YARN-1968) YARN Admin service should have more fine-grained ACL which is based on mapping of users with methods/operations.

2014-04-21 Thread Junping Du (JIRA)
Junping Du created YARN-1968:


 Summary: YARN Admin service should have more fine-grained ACL 
which is based on mapping of users with methods/operations.
 Key: YARN-1968
 URL: https://issues.apache.org/jira/browse/YARN-1968
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Junping Du


AdminService's operations today cover different dimensions of management: some 
concern user management while others concern cluster management. 
Today, we only check whether a user belongs to an authorized group to decide 
whether he can execute operations in the admin service. As a result, he can 
either execute all operations or none, which is a simple strategy but not a very 
precise one. We may need more fine-grained ACLs that can authorize a user for a 
subset of the operations in AdminService.
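
One way to picture such a mapping is a table from operation name to the users allowed to invoke it. The sketch below is purely illustrative and is not a proposed design: OperationAcl is a hypothetical class, and the user names are made up.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical per-operation ACL table; not part of AdminService.
final class OperationAcl {
    private final Map<String, Set<String>> allowedUsersByOperation = new HashMap<>();

    /** Grants the given users permission to invoke one operation. */
    void allow(String operation, String... users) {
        allowedUsersByOperation
            .computeIfAbsent(operation, key -> new HashSet<>())
            .addAll(Arrays.asList(users));
    }

    /** Returns true only if the user was explicitly granted this operation. */
    boolean isAllowed(String user, String operation) {
        return allowedUsersByOperation
            .getOrDefault(operation, Collections.emptySet())
            .contains(user);
    }
}
```

Operations with no entry are denied to everyone here, which is a deliberately conservative default chosen for the sketch.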



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1969) Earliest Deadline First Scheduling in the Fair Scheduler

2014-04-21 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1969:
-

Summary: Earliest Deadline First Scheduling in the Fair Scheduler  (was: 
Earliest Deadline First Scheduling)

 Earliest Deadline First Scheduling in the Fair Scheduler
 

 Key: YARN-1969
 URL: https://issues.apache.org/jira/browse/YARN-1969
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Maysam Yabandeh
Assignee: Maysam Yabandeh

 What we are observing is that some big jobs with many allocated containers 
 are waiting for a few containers to finish. Under *fair-share scheduling*, 
 however, they have a low priority since there are other jobs (usually much 
 smaller newcomers) that are using resources way below their fair share, 
 hence newly released containers are not offered to the big, yet 
 close-to-finished job. Nevertheless, everybody would benefit from an 
 unfair scheduling that offers the resource to the big job, since the sooner 
 the big job finishes, the sooner it releases its many allocated resources 
 to be used by other jobs. In other words, what we require is a kind of 
 variation of *Earliest Deadline First scheduling* that takes into account 
 the number of already-allocated resources and the estimated time to finish.
 http://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling
 For example, if a job is using MEM GB of memory and is expected to finish in 
 TIME minutes, the priority in scheduling would be a function p of (MEM, 
 TIME). The expected time to finish can be estimated by the AppMaster using 
 TaskRuntimeEstimator#estimatedRuntime and be supplied to the RM in the resource 
 request messages. To be less susceptible to the issue of apps gaming the 
 system, we can have this scheduling limited to *only within a queue*: i.e., 
 adding an EarliestDeadlinePolicy that extends SchedulingPolicy and letting 
 the queues use it by setting the schedulingPolicy field.
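
The description leaves the function p unspecified. As one possible reading, the sketch below assumes p(MEM, TIME) = MEM / TIME, i.e. memory released per minute of remaining runtime. That formula, and the EdfSketch and Job names, are assumptions made for illustration and are not taken from the issue.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class EdfSketch {
    private EdfSketch() {}

    /** Hypothetical per-job snapshot: memory held (GB) and estimated minutes to finish. */
    static final class Job {
        final String name;
        final double memGb;
        final double minutesLeft;

        Job(String name, double memGb, double minutesLeft) {
            this.name = name;
            this.memGb = memGb;
            this.minutesLeft = minutesLeft;
        }
    }

    /** Assumed p(MEM, TIME): GB released per minute of remaining runtime. */
    static double priority(Job job) {
        // Guard against a zero estimate so the division stays finite.
        return job.memGb / Math.max(job.minutesLeft, 1e-3);
    }

    /** Returns a copy of jobs ordered so the highest-priority job comes first. */
    static List<Job> order(List<Job> jobs) {
        Comparator<Job> byPriority = Comparator.comparingDouble(EdfSketch::priority);
        List<Job> sorted = new ArrayList<>(jobs);
        sorted.sort(byPriority.reversed());
        return sorted;
    }
}
```

Under this assumed function, a job holding 200 GB with 2 minutes left outranks a job holding 2 GB with 30 minutes left, which matches the motivation of favouring big jobs that are close to finishing.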





[jira] [Commented] (YARN-1962) Timeline server is enabled by default

2014-04-21 Thread Mohammad Kamrul Islam (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975885#comment-13975885
 ] 

Mohammad Kamrul Islam commented on YARN-1962:
-

[~zjshen] Thanks for the feedback.
I will upload a new patch.



[jira] [Commented] (YARN-313) Add Admin API for supporting node resource configuration in command line

2014-04-21 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976010#comment-13976010
 ] 

Junping Du commented on YARN-313:
-

Thanks [~kj-ki] for contributing a sample patch here. I think some of the 
code in the sample patch duplicates what we already have in YARN-312 (the 
proto stuff for refreshResource), but I will check whether some of the code 
here can be integrated with my patch after YARN-1506 is figured out.

 Add Admin API for supporting node resource configuration in command line
 

 Key: YARN-313
 URL: https://issues.apache.org/jira/browse/YARN-313
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client
Reporter: Junping Du
Assignee: Junping Du
 Attachments: YARN-313-sample.patch


 We should provide an admin interface, e.g. yarn rmadmin -refreshResources, 
 to support changing a node's resources as specified in a config file.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.

2014-04-21 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976056#comment-13976056
 ] 

Jian He commented on YARN-1506:
---

Some comments on the patch, mostly cosmetic:
- Since the update is now asynchronous, we may want to change the log message 
so that it no longer says *successfully*:
{code}
LOG.info("Update resource successfully on node(" + node.getNodeID()
+ ") with resource(" + newResourceOption.toString() + ")");
{code}
- Logging inside UpdateNodeResourceWhenNonRunningTransition is good for 
debugging, as this should be an unusual case. Rename 
UpdateNodeResourceWhenNonRunningTransition -> 
UpdateNodeResourceWhenNotRunningTransition?
- IMO, since UpdateNodeResourceWhenUnusableTransition and 
UpdateNodeResourceWhenNonRunningTransition are the same except for one extra 
log statement, we can do the logging for both and keep just one transition.
- If possible, the nodeResourceUpdate method can be moved into 
AbstractYarnScheduler, a new common base class for sharing code among all 
the schedulers.
- Rename SchedulerNode.setTotalResource -> 
SchedulerNode.updateTotalAndAvailableResource()?
- UpdateNodeResourceResponse should be an abstract class which implements 
the newInstance() method.
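
The last point above (an abstract class that carries its own static 
newInstance() factory) could look roughly like the following plain-Java 
sketch. The class name UpdateNodeResourceResponseSketch and the anonymous 
subclass are stand-ins invented for illustration; this is not the code from 
the patch and it does not use any Hadoop record factory.

```java
/** Hypothetical stand-in for the response record; not the real YARN class. */
abstract class UpdateNodeResourceResponseSketch {

  /**
   * Static factory, so callers never have to name a concrete implementation.
   * Assumption: the real class would obtain its instance from a record
   * factory; an empty anonymous subclass stands in for that here.
   */
  static UpdateNodeResourceResponseSketch newInstance() {
    return new UpdateNodeResourceResponseSketch() { };
  }
}
```

Callers would then write UpdateNodeResourceResponseSketch.newInstance() 
instead of constructing an implementation class directly.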

 Replace set resource change on RMNode/SchedulerNode directly with event 
 notification.
 -

 Key: YARN-1506
 URL: https://issues.apache.org/jira/browse/YARN-1506
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, scheduler
Reporter: Junping Du
Assignee: Junping Du
 Attachments: YARN-1506-v1.patch, YARN-1506-v10.patch, 
 YARN-1506-v2.patch, YARN-1506-v3.patch, YARN-1506-v4.patch, 
 YARN-1506-v5.patch, YARN-1506-v6.patch, YARN-1506-v7.patch, 
 YARN-1506-v8.patch, YARN-1506-v9.patch


 According to Vinod's comments on YARN-312 
 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087),
  we should replace RMNode.setResourceOption() with some resource change event.





[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.

2014-04-21 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976061#comment-13976061
 ] 

Jian He commented on YARN-1506:
---

AdminService.updateNodeResource should use RMAuditLogger to log the 
operations as well.



[jira] [Updated] (YARN-1962) Timeline server is enabled by default

2014-04-21 Thread Mohammad Kamrul Islam (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Kamrul Islam updated YARN-1962:


Attachment: YARN-1962.2.patch

Patch with review comments.

 Timeline server is enabled by default
 -

 Key: YARN-1962
 URL: https://issues.apache.org/jira/browse/YARN-1962
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.4.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
 Attachments: YARN-1962.1.patch, YARN-1962.2.patch


 Since the Timeline server is not yet mature or secured, enabling it by 
 default might create some confusion.
 We were playing with 2.4.0 and found a lot of "connection refused" 
 exceptions in the distributed shell example. Btw, we didn't run the 
 Timeline server (TS) because it is not secured yet.
 It is possible to turn it off explicitly through the yarn-site config, 
 but in my opinion this extra change for a new service is not worthwhile 
 at this point.
 This JIRA is to turn it off by default.
 If there is agreement, I can put up a simple patch for this.
 {noformat}
 14/04/17 23:24:33 ERROR impl.TimelineClientImpl: Failed to get the response 
 from the timeline server.
 com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: 
 Connection refused
   at 
 com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
   at com.sun.jersey.api.client.Client.handle(Client.java:648)
   at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
   at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
   at 
 com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
   at 
 org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
   at 
 org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
   at 
 org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
   at 
 org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
   at 
 org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
 Caused by: java.net.ConnectException: Connection refused
   at java.net.PlainSocketImpl.socketConnect(Native Method)
   at 
 java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
   at 
 java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
   at 
 java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
   at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
   at java.net.Socket.connect(Socket.java:579)
   at java.net.Socket.connect(Socket.java:528)
   at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
   at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
   at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
   at sun.net.www.http.HttpClient.in14/04/17 23:24:33 ERROR 
 impl.TimelineClientImpl: Failed to get the response from the timeline server.
 com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: 
 Connection refused
   at 
 com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
   at com.sun.jersey.api.client.Client.handle(Client.java:648)
   at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
   at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
   at 
 com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
   at 
 org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
   at 
 org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
   at 
 org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
   at 
 org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
   at 
 org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
 Caused by: java.net.ConnectException: Connection refused
   at java.net.PlainSocketImpl.socketConnect(Native Method)
   at 
 java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
   at 
 java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
   at 
 java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
   at 

[jira] [Commented] (YARN-1962) Timeline server is enabled by default

2014-04-21 Thread Mohammad Kamrul Islam (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976094#comment-13976094
 ] 

Mohammad Kamrul Islam commented on YARN-1962:
-

Testing done:
1. Tested on a 100-node 2.4.0 cluster with [~tthompso].
2. Ran the relevant unit tests, including the new one.


[jira] [Commented] (YARN-1962) Timeline server is enabled by default

2014-04-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976127#comment-13976127
 ] 

Hadoop QA commented on YARN-1962:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12641129/YARN-1962.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3603//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3603//console

This message is automatically generated.


[jira] [Commented] (YARN-1970) Prepare YARN codebase for JUnit 4.11.

2014-04-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976205#comment-13976205
 ] 

Hudson commented on YARN-1970:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5547 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5547/])
YARN-1970. Prepare YARN codebase for JUnit 4.11. Contributed by Chris Nauroth. 
(cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1589001)
* 
/hadoop/common/trunk/hadoop-tools/hadoop-sls/src/test/java/org/apache/hadoop/yarn/sls/utils/TestSLSUtils.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-sls/src/test/java/org/apache/hadoop/yarn/sls/web/TestSLSWebApp.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestApplicationMasterServiceOnHA.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestResourceTrackerOnHA.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java


 Prepare YARN codebase for JUnit 4.11.
 -

 Key: YARN-1970
 URL: https://issues.apache.org/jira/browse/YARN-1970
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 3.0.0, 2.4.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Priority: Minor
 Fix For: 3.0.0, 2.5.0

 Attachments: YARN-1970.1.patch


 HADOOP-10503 upgrades the entire Hadoop repo to use JUnit 4.11. Some of the 
 YARN code needs some minor updates to fix deprecation warnings and test 
 isolation problems before the upgrade.





[jira] [Updated] (YARN-1969) Fair Scheduler: Add policy for Earliest Deadline First

2014-04-21 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1969:
---

Summary: Fair Scheduler: Add policy for Earliest Deadline First  (was: 
Earliest Deadline First Scheduling in the Fair Scheduler)

 Fair Scheduler: Add policy for Earliest Deadline First
 --

 Key: YARN-1969
 URL: https://issues.apache.org/jira/browse/YARN-1969
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Maysam Yabandeh
Assignee: Maysam Yabandeh

 What we are observing is that some big jobs with many allocated containers 
 are waiting for a few containers to finish. Under *fair-share scheduling*, 
 however, they have a low priority, since there are other jobs (usually much 
 smaller newcomers) that are using resources way below their fair share; 
 hence newly released containers are not offered to the big, yet 
 close-to-finished, job. Nevertheless, everybody would benefit from an 
 unfair scheduling that offers the resource to the big job, since the sooner 
 the big job finishes, the sooner it releases its many allocated resources 
 to be used by other jobs. In other words, what we require is a variation 
 of *Earliest Deadline First scheduling* that takes into account 
 the number of already-allocated resources and the estimated time to finish.
 http://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling
 For example, if a job is using MEM GB of memory and is expected to finish in 
 TIME minutes, the priority in scheduling would be a function p of (MEM, 
 TIME). The expected time to finish can be estimated by the AppMaster using 
 TaskRuntimeEstimator#estimatedRuntime and be supplied to the RM in the 
 resource request messages. To be less susceptible to apps gaming the 
 system, we can limit this scheduling to *only within a queue*: i.e., 
 add an EarliestDeadlinePolicy that extends SchedulingPolicy and let the 
 queues use it by setting the schedulingPolicy field.
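
 To make the idea of a priority function p(MEM, TIME) concrete, here is a 
 minimal plain-Java sketch. The description does not define p, so the 
 choice p = MEM / TIME (memory released per remaining minute) is only an 
 assumption, and the names EarliestDeadlineSketch, Job, priority and 
 offerOrder are hypothetical; none of this is Fair Scheduler code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; not part of the Fair Scheduler API. */
final class EarliestDeadlineSketch {

  /** A job's currently allocated memory and its estimated time to finish. */
  static final class Job {
    final String name;
    final double memGb;        // MEM: memory currently held, in GB
    final double minutesLeft;  // TIME: estimated minutes until completion

    Job(String name, double memGb, double minutesLeft) {
      this.name = name;
      this.memGb = memGb;
      this.minutesLeft = minutesLeft;
    }
  }

  /**
   * One possible p(MEM, TIME): memory released per remaining minute.
   * Larger values mean the job frees more resources sooner.
   */
  static double priority(Job job) {
    // Floor TIME so a zero or negative estimate cannot divide by zero.
    double time = Math.max(job.minutesLeft, 1e-3);
    return job.memGb / time;
  }

  /** Orders jobs so that the highest-priority job comes first. */
  static final Comparator<Job> BY_PRIORITY_DESC =
      (a, b) -> Double.compare(priority(b), priority(a));

  /** Returns job names in the order they would be offered a freed container. */
  static List<String> offerOrder(List<Job> jobs) {
    List<Job> sorted = new ArrayList<>(jobs);
    sorted.sort(BY_PRIORITY_DESC);
    List<String> names = new ArrayList<>();
    for (Job j : sorted) {
      names.add(j.name);
    }
    return names;
  }
}
```

 With this choice of p, a job holding 100 GB that is 5 minutes from 
 finishing is ranked ahead of a job holding 2 GB with 60 minutes left. A 
 real policy would also have to decide how much to trust the estimates 
 supplied by each AppMaster.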





[jira] [Updated] (YARN-1897) Define SignalContainerRequest and SignalContainerResponse

2014-04-21 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1897:


Attachment: YARN-1897.1.patch

 Define SignalContainerRequest and SignalContainerResponse
 -

 Key: YARN-1897
 URL: https://issues.apache.org/jira/browse/YARN-1897
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api
Reporter: Ming Ma
 Attachments: YARN-1897.1.patch


 We need to define SignalContainerRequest and SignalContainerResponse first, 
 as they are needed by other sub-tasks. SignalContainerRequest should use 
 OS-independent commands and provide a way for the application to specify a 
 reason for diagnosis. SignalContainerResponse might be empty.
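
 As a rough illustration of the shape described above, the following 
 plain-Java sketch pairs an OS-independent command enum with a free-form 
 reason string. Every name in it (SignalContainerSketch, Command, the enum 
 constants, Request, Response) is hypothetical and chosen only for this 
 example; the actual records are whatever the attached patch defines.

```java
/** Hypothetical sketch of the request/response pair; names are illustrative. */
final class SignalContainerSketch {

  /** OS-independent commands; each platform would map them to native signals. */
  enum Command { OUTPUT_THREAD_DUMP, GRACEFUL_SHUTDOWN, FORCEFUL_SHUTDOWN }

  /** Request naming the target container, the command, and a diagnostic reason. */
  static final class Request {
    final String containerId;
    final Command command;
    final String reason;  // free-form text recorded for diagnosis

    Request(String containerId, Command command, String reason) {
      this.containerId = containerId;
      this.command = command;
      this.reason = reason;
    }
  }

  /** Intentionally empty, since the description says it might be. */
  static final class Response { }
}
```

 Keeping the command as an enum rather than a raw signal number is what 
 makes the request OS-independent: the NodeManager side can translate each 
 constant into whatever the local platform supports.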





[jira] [Commented] (YARN-1897) Define SignalContainerRequest and SignalContainerResponse

2014-04-21 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976268#comment-13976268
 ] 

Xuan Gong commented on YARN-1897:
-

[~mingma]
I uploaded an initial patch for this. Please take a look and feel free to 
make any edits, renames, etc.




[jira] [Commented] (YARN-1796) container-executor shouldn't require o-r permissions

2014-04-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976278#comment-13976278
 ] 

Hadoop QA commented on YARN-1796:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12633282/YARN-1796.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3604//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3604//console

This message is automatically generated.

 container-executor shouldn't require o-r permissions
 

 Key: YARN-1796
 URL: https://issues.apache.org/jira/browse/YARN-1796
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.4.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Minor
 Attachments: YARN-1796.patch


 The container-executor currently checks that other users don't have read 
 permissions. This is unnecessary and runs contrary to the Debian packaging 
 policy manual.
 This is the analogous fix for YARN that was done for MR1 in MAPREDUCE-2103.





[jira] [Commented] (YARN-1796) container-executor shouldn't require o-r permissions

2014-04-21 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976282#comment-13976282
 ] 

Sandy Ryza commented on YARN-1796:
--

+1



[jira] [Commented] (YARN-1897) Define SignalContainerRequest and SignalContainerResponse

2014-04-21 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976310#comment-13976310
 ] 

Ming Ma commented on YARN-1897:
---

Thanks, Xuan. I will merge this one with the version I have and provide an 
update shortly. BTW, why does SignalContainerResponse need to provide a 
diagnosis string? Is it to explain why the request can't be processed?



[jira] [Commented] (YARN-1897) Define SignalContainerRequest and SignalContainerResponse

2014-04-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976408#comment-13976408
 ] 

Hadoop QA commented on YARN-1897:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12641175/YARN-1897-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 3 
warning messages.
See 
https://builds.apache.org/job/PreCommit-YARN-Build/3605//artifact/trunk/patchprocess/diffJavadocWarnings.txt
 for details.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3605//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3605//console

This message is automatically generated.

 Define SignalContainerRequest and SignalContainerResponse
 -

 Key: YARN-1897
 URL: https://issues.apache.org/jira/browse/YARN-1897
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api
Reporter: Ming Ma
 Attachments: YARN-1897-2.patch, YARN-1897.1.patch


 We need to define SignalContainerRequest and SignalContainerResponse first, 
 as they are needed by other sub-tasks. SignalContainerRequest should use 
 OS-independent commands and provide a way for the application to specify a 
 reason for diagnosis. SignalContainerResponse might be empty.


