[jira] [Commented] (YARN-1418) Add Tracing to YARN

2014-04-21 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976465#comment-13976465
 ] 

Masatake Iwasaki commented on YARN-1418:


One of the YARN-specific to-do items is adding a way to pass tracing information to 
containers forked from NodeManagers. Using a configuration property would be a 
straightforward approach.
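
As a rough illustration of the configuration-property approach, the sketch below writes a trace context into a key/value configuration and reads it back on the container side. It is only a sketch under stated assumptions: the property name `example.trace.parent`, the `TraceContext` class, and the use of a plain `Map` in place of a Hadoop configuration object are all hypothetical and are not taken from HTrace or from any YARN patch.

```java
import java.util.HashMap;
import java.util.Map;

final class TraceContextProperty {
    // Hypothetical property name, chosen only for this illustration.
    static final String TRACE_PROPERTY = "example.trace.parent";

    /** Minimal trace context: a trace id plus the id of the parent span. */
    static final class TraceContext {
        final long traceId;
        final long parentSpanId;

        TraceContext(long traceId, long parentSpanId) {
            this.traceId = traceId;
            this.parentSpanId = parentSpanId;
        }
    }

    /** Launcher side: encode the context as "traceIdHex:spanIdHex". */
    static void inject(Map<String, String> conf, TraceContext ctx) {
        conf.put(TRACE_PROPERTY,
            Long.toHexString(ctx.traceId) + ":" + Long.toHexString(ctx.parentSpanId));
    }

    /** Container side: decode the context, or return null if it is absent or malformed. */
    static TraceContext extract(Map<String, String> conf) {
        String value = conf.get(TRACE_PROPERTY);
        if (value == null) {
            return null;
        }
        String[] parts = value.split(":");
        if (parts.length != 2) {
            return null;
        }
        try {
            return new TraceContext(Long.parseUnsignedLong(parts[0], 16),
                                    Long.parseUnsignedLong(parts[1], 16));
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

The container would call `extract` at startup and, if a context is present, start its own spans as children of the parent span.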

> Add Tracing to YARN
> ---
>
> Key: YARN-1418
> URL: https://issues.apache.org/jira/browse/YARN-1418
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api, nodemanager, resourcemanager
>Reporter: Masatake Iwasaki
>
> Add tracing using HTrace in the same way as HBASE-6449 and HDFS-5274.
> Most of the changes needed for the underlying basis, such as RPC, seem to be almost 
> ready in HDFS-5274.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1897) Define SignalContainerRequest and SignalContainerResponse

2014-04-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976408#comment-13976408
 ] 

Hadoop QA commented on YARN-1897:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12641175/YARN-1897-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 3 
warning messages.
See 
https://builds.apache.org/job/PreCommit-YARN-Build/3605//artifact/trunk/patchprocess/diffJavadocWarnings.txt
 for details.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3605//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3605//console

This message is automatically generated.

> Define SignalContainerRequest and SignalContainerResponse
> -
>
> Key: YARN-1897
> URL: https://issues.apache.org/jira/browse/YARN-1897
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Reporter: Ming Ma
> Attachments: YARN-1897-2.patch, YARN-1897.1.patch
>
>
> We need to define SignalContainerRequest and SignalContainerResponse first as 
> they are needed by other sub tasks. SignalContainerRequest should use 
> OS-independent commands and provide a way for the application to specify a "reason" 
> for diagnosis. SignalContainerResponse might be empty.





[jira] [Updated] (YARN-1897) Define SignalContainerRequest and SignalContainerResponse

2014-04-21 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated YARN-1897:
--

Attachment: YARN-1897-2.patch

Xuan, thanks for the early patch. Here is the updated version, which expands 
SignalContainerCommand and renames some methods.

1. SignalContainerResponse has a flag to indicate the request was submitted 
successfully. If it fails, the application doesn't know why. Is that what the 
diagnosis string is for? The previous patch just throws an exception.

2. There is no unit test just for this patch. I tested it manually with related 
changes in YarnCLI and RM to verify messages are being passed properly. When 
other work items in RM and NM are added, unit tests will be added accordingly.
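
To make the question in item 1 concrete, here is a minimal sketch of a request that carries an OS-independent command plus a "reason", and a response that carries a success flag plus a diagnosis string. All type, field, and constant names below are hypothetical; they are not taken from YARN-1897-2.patch, and the `submit` method is only a stand-in for whatever check the RM would perform.

```java
import java.util.Set;

// Hypothetical OS-independent commands; the names are illustrative only.
enum SignalCommandSketch { OUTPUT_THREAD_DUMP, GRACEFUL_SHUTDOWN, FORCEFUL_SHUTDOWN }

final class SignalRequestSketch {
    final String containerId;
    final SignalCommandSketch command;
    final String reason; // free-form text supplied by the application for diagnosis

    SignalRequestSketch(String containerId, SignalCommandSketch command, String reason) {
        this.containerId = containerId;
        this.command = command;
        this.reason = reason;
    }
}

final class SignalResponseSketch {
    final boolean submitted;
    final String diagnostics; // empty when the request was submitted

    private SignalResponseSketch(boolean submitted, String diagnostics) {
        this.submitted = submitted;
        this.diagnostics = diagnostics;
    }

    static SignalResponseSketch ok() {
        return new SignalResponseSketch(true, "");
    }

    static SignalResponseSketch failed(String why) {
        return new SignalResponseSketch(false, why);
    }
}

final class SignalSketch {
    // Stand-in for the server-side check: accept the request only for a known container.
    static SignalResponseSketch submit(SignalRequestSketch request, Set<String> liveContainers) {
        if (!liveContainers.contains(request.containerId)) {
            return SignalResponseSketch.failed("Unknown container: " + request.containerId);
        }
        return SignalResponseSketch.ok();
    }
}
```

With this shape the caller can tell a rejected request from a submitted one and read why it was rejected, which is the alternative to throwing an exception.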

> Define SignalContainerRequest and SignalContainerResponse
> -
>
> Key: YARN-1897
> URL: https://issues.apache.org/jira/browse/YARN-1897
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Reporter: Ming Ma
> Attachments: YARN-1897-2.patch, YARN-1897.1.patch
>
>
> We need to define SignalContainerRequest and SignalContainerResponse first as 
> they are needed by other sub tasks. SignalContainerRequest should use 
> OS-independent commands and provide a way for the application to specify a "reason" 
> for diagnosis. SignalContainerResponse might be empty.





[jira] [Commented] (YARN-1897) Define SignalContainerRequest and SignalContainerResponse

2014-04-21 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976310#comment-13976310
 ] 

Ming Ma commented on YARN-1897:
---

Thanks, Xuan. I will merge this one with the version I have and provide an 
update shortly. BTW, why does SignalContainerResponse need to provide a 
diagnosis string? To explain why the request can't be processed?

> Define SignalContainerRequest and SignalContainerResponse
> -
>
> Key: YARN-1897
> URL: https://issues.apache.org/jira/browse/YARN-1897
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Reporter: Ming Ma
> Attachments: YARN-1897.1.patch
>
>
> We need to define SignalContainerRequest and SignalContainerResponse first as 
> they are needed by other sub tasks. SignalContainerRequest should use 
> OS-independent commands and provide a way for the application to specify a "reason" 
> for diagnosis. SignalContainerResponse might be empty.





[jira] [Commented] (YARN-1796) container-executor shouldn't require o-r permissions

2014-04-21 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976282#comment-13976282
 ] 

Sandy Ryza commented on YARN-1796:
--

+1

> container-executor shouldn't require o-r permissions
> 
>
> Key: YARN-1796
> URL: https://issues.apache.org/jira/browse/YARN-1796
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.4.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>Priority: Minor
> Attachments: YARN-1796.patch
>
>
> The container-executor currently checks that "other" users don't have read 
> permissions. This is unnecessary and runs contrary to the debian packaging 
> policy manual.
> This is the analogous fix for YARN that was done for MR1 in MAPREDUCE-2103.





[jira] [Commented] (YARN-1796) container-executor shouldn't require o-r permissions

2014-04-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976278#comment-13976278
 ] 

Hadoop QA commented on YARN-1796:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12633282/YARN-1796.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3604//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3604//console

This message is automatically generated.

> container-executor shouldn't require o-r permissions
> 
>
> Key: YARN-1796
> URL: https://issues.apache.org/jira/browse/YARN-1796
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.4.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>Priority: Minor
> Attachments: YARN-1796.patch
>
>
> The container-executor currently checks that "other" users don't have read 
> permissions. This is unnecessary and runs contrary to the debian packaging 
> policy manual.
> This is the analogous fix for YARN that was done for MR1 in MAPREDUCE-2103.





[jira] [Commented] (YARN-1897) Define SignalContainerRequest and SignalContainerResponse

2014-04-21 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976268#comment-13976268
 ] 

Xuan Gong commented on YARN-1897:
-

[~mingma]
I uploaded an initial patch for this. Please take a look and feel free to make 
any edits, renames, etc.


> Define SignalContainerRequest and SignalContainerResponse
> -
>
> Key: YARN-1897
> URL: https://issues.apache.org/jira/browse/YARN-1897
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Reporter: Ming Ma
> Attachments: YARN-1897.1.patch
>
>
> We need to define SignalContainerRequest and SignalContainerResponse first as 
> they are needed by other sub tasks. SignalContainerRequest should use 
> OS-independent commands and provide a way for the application to specify a "reason" 
> for diagnosis. SignalContainerResponse might be empty.





[jira] [Updated] (YARN-1897) Define SignalContainerRequest and SignalContainerResponse

2014-04-21 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1897:


Attachment: YARN-1897.1.patch

> Define SignalContainerRequest and SignalContainerResponse
> -
>
> Key: YARN-1897
> URL: https://issues.apache.org/jira/browse/YARN-1897
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Reporter: Ming Ma
> Attachments: YARN-1897.1.patch
>
>
> We need to define SignalContainerRequest and SignalContainerResponse first as 
> they are needed by other sub tasks. SignalContainerRequest should use 
> OS-independent commands and provide a way for the application to specify a "reason" 
> for diagnosis. SignalContainerResponse might be empty.





[jira] [Updated] (YARN-1969) Fair Scheduler: Add policy for Earliest Deadline First

2014-04-21 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1969:
---

Summary: Fair Scheduler: Add policy for Earliest Deadline First  (was: 
Earliest Deadline First Scheduling in the Fair Scheduler)

> Fair Scheduler: Add policy for Earliest Deadline First
> --
>
> Key: YARN-1969
> URL: https://issues.apache.org/jira/browse/YARN-1969
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Maysam Yabandeh
>Assignee: Maysam Yabandeh
>
> What we are observing is that some big jobs with many allocated containers 
> are waiting for a few containers to finish. Under *fair-share scheduling*, 
> however, they have a low priority since there are other jobs (usually much 
> smaller newcomers) that are using resources way below their fair share, 
> hence newly released containers are not offered to the big, yet 
> close-to-finished job. Nevertheless, everybody would benefit from an 
> "unfair" scheduling that offers the resource to the big job, since the sooner 
> the big job finishes, the sooner it releases its "many" allocated resources 
> to be used by other jobs. In other words, what we require is a kind of 
> variation of *Earliest Deadline First scheduling* that takes into account 
> the number of already-allocated resources and estimated time to finish.
> http://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling
> For example, if a job is using MEM GB of memory and is expected to finish in 
> TIME minutes, the priority in scheduling would be a function p of (MEM, 
> TIME). The expected time to finish can be estimated by the AppMaster using 
> TaskRuntimeEstimator#estimatedRuntime and be supplied to RM in the resource 
> request messages. To be less susceptible to the issue of apps gaming the 
> system, we can have this scheduling limited to *only within a queue*: i.e., 
> adding an EarliestDeadlinePolicy that extends SchedulingPolicy and letting the 
> queues use it by setting the "schedulingPolicy" field.
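
The description leaves the priority function p(MEM, TIME) open. As one possible reading, the sketch below assumes p = MEM / TIME, i.e. memory released per minute of remaining runtime, and orders the applications in a queue by that value. The class and field names are hypothetical, and nothing here uses the real SchedulingPolicy API; it only illustrates the ordering.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class EdfSketch {
    /** Hypothetical view of an application inside one queue. */
    static final class App {
        final String name;
        final double memGb;           // memory currently allocated (MEM)
        final double minutesToFinish; // estimate supplied by the AppMaster (TIME)

        App(String name, double memGb, double minutesToFinish) {
            this.name = name;
            this.memGb = memGb;
            this.minutesToFinish = minutesToFinish;
        }
    }

    // Assumed priority function: memory freed per minute of remaining runtime.
    // The lower bound on TIME avoids division by zero for jobs about to finish.
    static double priority(App app) {
        return app.memGb / Math.max(app.minutesToFinish, 1e-3);
    }

    // Higher priority sorts first.
    static final Comparator<App> HIGHEST_PRIORITY_FIRST =
        (a, b) -> Double.compare(priority(b), priority(a));

    /** Returns a new list ordered so that released containers go to the first app. */
    static List<App> order(List<App> apps) {
        List<App> sorted = new ArrayList<>(apps);
        sorted.sort(HIGHEST_PRIORITY_FIRST);
        return sorted;
    }
}
```

Under this assumption a 100 GB job expected to finish in 5 minutes is served before a 2 GB job expected to run for an hour, which matches the "unfair" behaviour the description asks for.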





[jira] [Commented] (YARN-1970) Prepare YARN codebase for JUnit 4.11.

2014-04-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976205#comment-13976205
 ] 

Hudson commented on YARN-1970:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5547 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5547/])
YARN-1970. Prepare YARN codebase for JUnit 4.11. Contributed by Chris Nauroth. 
(cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1589001)
* 
/hadoop/common/trunk/hadoop-tools/hadoop-sls/src/test/java/org/apache/hadoop/yarn/sls/utils/TestSLSUtils.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-sls/src/test/java/org/apache/hadoop/yarn/sls/web/TestSLSWebApp.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestApplicationMasterServiceOnHA.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestResourceTrackerOnHA.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java


> Prepare YARN codebase for JUnit 4.11.
> -
>
> Key: YARN-1970
> URL: https://issues.apache.org/jira/browse/YARN-1970
> Project: Hadoop YARN
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>Priority: Minor
> Fix For: 3.0.0, 2.5.0
>
> Attachments: YARN-1970.1.patch
>
>
> HADOOP-10503 upgrades the entire Hadoop repo to use JUnit 4.11. Some of the 
> YARN code needs some minor updates to fix deprecation warnings and test 
> isolation problems before the upgrade.





[jira] [Commented] (YARN-1970) Prepare YARN codebase for JUnit 4.11.

2014-04-21 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976178#comment-13976178
 ] 

Chris Nauroth commented on YARN-1970:
-

Thanks, Arpit.  I'll commit this soon.  BTW, I meant to mention that 
HADOOP-10503 contains some comments with more explanation of the need for these 
changes.

> Prepare YARN codebase for JUnit 4.11.
> -
>
> Key: YARN-1970
> URL: https://issues.apache.org/jira/browse/YARN-1970
> Project: Hadoop YARN
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>Priority: Minor
> Attachments: YARN-1970.1.patch
>
>
> HADOOP-10503 upgrades the entire Hadoop repo to use JUnit 4.11. Some of the 
> YARN code needs some minor updates to fix deprecation warnings and test 
> isolation problems before the upgrade.





[jira] [Commented] (YARN-1970) Prepare YARN codebase for JUnit 4.11.

2014-04-21 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976155#comment-13976155
 ] 

Arpit Agarwal commented on YARN-1970:
-

+1 for the patch.

> Prepare YARN codebase for JUnit 4.11.
> -
>
> Key: YARN-1970
> URL: https://issues.apache.org/jira/browse/YARN-1970
> Project: Hadoop YARN
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>Priority: Minor
> Attachments: YARN-1970.1.patch
>
>
> HADOOP-10503 upgrades the entire Hadoop repo to use JUnit 4.11. Some of the 
> YARN code needs some minor updates to fix deprecation warnings and test 
> isolation problems before the upgrade.





[jira] [Commented] (YARN-1962) Timeline server is enabled by default

2014-04-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976127#comment-13976127
 ] 

Hadoop QA commented on YARN-1962:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12641129/YARN-1962.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3603//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3603//console

This message is automatically generated.

> Timeline server is enabled by default
> -
>
> Key: YARN-1962
> URL: https://issues.apache.org/jira/browse/YARN-1962
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
> Attachments: YARN-1962.1.patch, YARN-1962.2.patch
>
>
> Since the Timeline server is not mature and secured yet, enabling it by default 
> might create some confusion.
> We were playing with 2.4.0 and found a lot of exceptions for the distributed 
> shell example related to a connection refused error. Btw, we didn't run TS 
> because it is not secured yet.
> It is possible to explicitly turn it off through the yarn-site config, but 
> in my opinion, this extra change for a new service is not worthwhile at this 
> point.
> This JIRA is to turn it off by default.
> If there is agreement, I can put up a simple patch for this.
> {noformat}
> 14/04/17 23:24:33 ERROR impl.TimelineClientImpl: Failed to get the response 
> from the timeline server.
> com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: 
> Connection refused
>   at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
>   at com.sun.jersey.api.client.Client.handle(Client.java:648)
>   at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
>   at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
>   at 
> com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
> Caused by: java.net.ConnectException: Connection refused
>   at java.net.PlainSocketImpl.socketConnect(Native Method)
>   at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>   at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>   at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>   at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>   at java.net.Socket.connect(Socket.java:579)
>   at java.net.Socket.connect(Socket.java:528)
>   at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
>   at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
>   at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
>   at sun.net.www.http.HttpClient. impl.TimelineClientImpl: Failed to get the response from the timeline server.
> com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: 
> Connection refused
>   at 
> com.sun.jersey.client.urlconnection.URL

[jira] [Commented] (YARN-1962) Timeline server is enabled by default

2014-04-21 Thread Mohammad Kamrul Islam (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976094#comment-13976094
 ] 

Mohammad Kamrul Islam commented on YARN-1962:
-

Testing done:
1. Tested on a 100-node 2.4.0 cluster with [~tthompso] 
2. Ran the relevant unit tests, including the new one.

> Timeline server is enabled by default
> -
>
> Key: YARN-1962
> URL: https://issues.apache.org/jira/browse/YARN-1962
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
> Attachments: YARN-1962.1.patch, YARN-1962.2.patch
>
>

[jira] [Updated] (YARN-1962) Timeline server is enabled by default

2014-04-21 Thread Mohammad Kamrul Islam (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Kamrul Islam updated YARN-1962:


Attachment: YARN-1962.2.patch

Patch with review comments.

> Timeline server is enabled by default
> -
>
> Key: YARN-1962
> URL: https://issues.apache.org/jira/browse/YARN-1962
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
> Attachments: YARN-1962.1.patch, YARN-1962.2.patch
>
>

[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.

2014-04-21 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976061#comment-13976061
 ] 

Jian He commented on YARN-1506:
---

AdminService.updateNodeResource should use RMAuditLogger to log the operation 
as well.
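
The audit-logging suggestion above can be sketched roughly as follows. This is only an illustration: AuditLogSketch, its logSuccess/logFailure helpers and the validation rule are hypothetical stand-ins, not the real RMAuditLogger or AdminService code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an audit logger; NOT the real RMAuditLogger API.
class AuditLogSketch {
    // Collected audit lines, kept in memory so the behaviour is easy to inspect.
    static final List<String> AUDIT = new ArrayList<>();

    static void logSuccess(String user, String operation, String target) {
        AUDIT.add("USER=" + user + "\tOPERATION=" + operation
            + "\tTARGET=" + target + "\tRESULT=SUCCESS");
    }

    static void logFailure(String user, String operation, String target, String reason) {
        AUDIT.add("USER=" + user + "\tOPERATION=" + operation
            + "\tTARGET=" + target + "\tRESULT=FAILURE\tDESCRIPTION=" + reason);
    }

    // Hypothetical admin operation: validate the request, then record the outcome.
    static boolean updateNodeResource(String user, String nodeId, int memoryMb) {
        String operation = "updateNodeResource";
        if (memoryMb < 0) {
            logFailure(user, operation, "AdminService", "negative memory for " + nodeId);
            return false;
        }
        // The resource-change event would be dispatched here.
        logSuccess(user, operation, "AdminService");
        return true;
    }
}
```

The point is only that both the accepted and the rejected path leave an audit record.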

> Replace set resource change on RMNode/SchedulerNode directly with event 
> notification.
> -
>
> Key: YARN-1506
> URL: https://issues.apache.org/jira/browse/YARN-1506
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-1506-v1.patch, YARN-1506-v10.patch, 
> YARN-1506-v2.patch, YARN-1506-v3.patch, YARN-1506-v4.patch, 
> YARN-1506-v5.patch, YARN-1506-v6.patch, YARN-1506-v7.patch, 
> YARN-1506-v8.patch, YARN-1506-v9.patch
>
>
> According to Vinod's comments on YARN-312 
> (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087),
>  we should replace RMNode.setResourceOption() with some resource change event.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.

2014-04-21 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976056#comment-13976056
 ] 

Jian He commented on YARN-1506:
---

Some comments on the patch, mostly cosmetic:
- Since the update is now asynchronous, the log should no longer say 
*successfully*.
{code}
LOG.info("Update resource successfully on node(" + node.getNodeID()
+") with resource(" + newResourceOption.toString() + ")");
{code}
- Add a log inside UpdateNodeResourceWhenNonRunningTransition; it is good for 
debugging, as this should be an unusual case. Also, rename 
UpdateNodeResourceWhenNonRunningTransition -> 
UpdateNodeResourceWhenNotRunningTransition ?
- IMO, since UpdateNodeResourceWhenUnusableTransition and 
UpdateNodeResourceWhenNonRunningTransition are the same except for one extra 
log statement, we can do the logging for both and keep just one transition?
- If possible, the nodeResourceUpdate method can be moved into 
AbstractYarnScheduler, a new common base class for sharing code among 
all the schedulers.
- SchedulerNode.setTotalResource -> 
SchedulerNode.updateTotalAndAvailableResource() ?
- UpdateNodeResourceResponse should be an abstract class which implements the 
newInstance() method.
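
For the last point, the "abstract class with a static newInstance()" shape could look roughly like this. UpdateResponseSketch and its nested Impl are hypothetical; the real record would presumably be created through YARN's own record factory rather than a plain constructor.

```java
// Hypothetical stand-in for a response record; not the actual YARN class.
abstract class UpdateResponseSketch {
    // Static factory: callers never name the concrete implementation.
    static UpdateResponseSketch newInstance() {
        return new Impl();
    }

    // The response carries no payload in this sketch.
    abstract boolean isEmpty();

    // Concrete implementation hidden behind the factory.
    private static final class Impl extends UpdateResponseSketch {
        @Override
        boolean isEmpty() {
            return true;
        }
    }
}
```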

> Replace set resource change on RMNode/SchedulerNode directly with event 
> notification.
> -
>
> Key: YARN-1506
> URL: https://issues.apache.org/jira/browse/YARN-1506
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-1506-v1.patch, YARN-1506-v10.patch, 
> YARN-1506-v2.patch, YARN-1506-v3.patch, YARN-1506-v4.patch, 
> YARN-1506-v5.patch, YARN-1506-v6.patch, YARN-1506-v7.patch, 
> YARN-1506-v8.patch, YARN-1506-v9.patch
>
>
> According to Vinod's comments on YARN-312 
> (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087),
>  we should replace RMNode.setResourceOption() with some resource change event.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-313) Add Admin API for supporting node resource configuration in command line

2014-04-21 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976010#comment-13976010
 ] 

Junping Du commented on YARN-313:
-

Thanks [~kj-ki] for contributing a sample patch here. Although I think some 
code in the sample patch duplicates what we already have in YARN-312 (the 
proto stuff on refreshResource), I will check if some code here can be 
integrated with my patch after YARN-1506 is figured out.

> Add Admin API for supporting node resource configuration in command line
> 
>
> Key: YARN-313
> URL: https://issues.apache.org/jira/browse/YARN-313
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-313-sample.patch
>
>
> We should provide some admin interface, e.g. "yarn rmadmin -refreshResources" 
> to support changes of node's resource specified in a config file.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1970) Prepare YARN codebase for JUnit 4.11.

2014-04-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976000#comment-13976000
 ] 

Hadoop QA commented on YARN-1970:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12641104/YARN-1970.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3602//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3602//console

This message is automatically generated.

> Prepare YARN codebase for JUnit 4.11.
> -
>
> Key: YARN-1970
> URL: https://issues.apache.org/jira/browse/YARN-1970
> Project: Hadoop YARN
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>Priority: Minor
> Attachments: YARN-1970.1.patch
>
>
> HADOOP-10503 upgrades the entire Hadoop repo to use JUnit 4.11. Some of the 
> YARN code needs some minor updates to fix deprecation warnings and test 
> isolation problems before the upgrade.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1964) Support Docker containers in YARN

2014-04-21 Thread jay vyas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975937#comment-13975937
 ] 

jay vyas commented on YARN-1964:


Ah, sorry, I thought this was meant to be done at a different part of the stack 
... So is this JIRA specifically to create a "DockerContainerExecutor" class? 
Then that would be a really good idea, and I'm pretty sure it would be feasible.  
I guess you'd need to add a few parameters to the core-site:

in core-site.xml
{noformat}
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>DockerContainerExecutor</value>
</property>
{noformat}

and then maybe have a docker-site.xml 
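{noformat}
<property>
  <name>docker.container.container.impl</name>
  <value>PythonCentOSContainer</value>
</property>

<property>
  <name>docker.container.container1</name>
  <value>PythonCentOSContainer</value>
</property>
<property>
  <name>docker.container.container2</name>
  <value>MyContainerWithPostgres</value>
</property>
...
{noformat}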


And then somehow localize resources in a docker-ish sort of way so that the 
containers can see all the task resources properly. Is that the idea here?
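
To make the "pick the executor through a config property" idea concrete, here is a rough, self-contained sketch. It uses java.util.Properties instead of Hadoop's Configuration, and the ContainerExecutor interface and both executor classes are hypothetical placeholders, so treat it as an illustration of the plugin pattern only.

```java
import java.util.Properties;

class ExecutorConfigSketch {
    // Hypothetical executor types, for illustration only.
    interface ContainerExecutor {
        String name();
    }

    static class DefaultExecutor implements ContainerExecutor {
        public String name() { return "default"; }
    }

    static class DockerExecutor implements ContainerExecutor {
        public String name() { return "docker"; }
    }

    static final String EXECUTOR_KEY = "yarn.nodemanager.container-executor.class";

    // Instantiate whichever class the property names; fall back to the default.
    static ContainerExecutor load(Properties conf) {
        String className = conf.getProperty(EXECUTOR_KEY, DefaultExecutor.class.getName());
        try {
            Class<? extends ContainerExecutor> cls =
                Class.forName(className).asSubclass(ContainerExecutor.class);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create executor " + className, e);
        }
    }
}
```

With no property set, load() returns the default executor; setting the property to the Docker class name switches the implementation without touching the caller.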



> Support Docker containers in YARN
> -
>
> Key: YARN-1964
> URL: https://issues.apache.org/jira/browse/YARN-1964
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
>
> Docker (https://www.docker.io/) is, increasingly, a very popular container 
> technology.
> In context of YARN, the support for Docker will provide a very elegant 
> solution to allow applications to *package* their software into a Docker 
> container (entire Linux file system incl. custom versions of perl, python 
> etc.) and use it as a blueprint to launch all their YARN containers with 
> requisite software environment. This provides both consistency (all YARN 
> containers will have the same software environment) and isolation (no 
> interference with whatever is installed on the physical machine).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.

2014-04-21 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975920#comment-13975920
 ] 

Bikas Saha commented on YARN-1506:
--

This sleep is too long for a test. Same for other sleeps in the patch.
{code}+while (alloc1Response.getAllocatedContainers().size() < 1) {
+  LOG.info("Waiting for containers to be created for app 1...");
+  Thread.sleep(1000);{code}

In the test it will be useful to have 2 outstanding container requests for that 
machine (while the machine is fully booked) so we know that the scheduler will 
be trying to allocate on that machine whenever the machine heartbeats. One 
outstanding request should be for 2GB and the other for 3GB. Also, after the 
node resource is changed, the test should do a few node heartbeats to make sure 
that the scheduler will try to allocate a new container (for the outstanding 
request) on that node and fail to do so (+ not hit any NPE). Then after the 
first container completes, the test should check that the 2GB outstanding 
container request is satisfied but not the 3GB request. Thereafter, complete 
the second container and verify that the 3GB request is still not satisfied 
(because the NM has only 2GB of resource).
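
The scenario above can be walked through with a toy model. This is a sketch under assumptions: the node starts at 4GB with a single 4GB container (the text only says "fully booked"), the "second container" is taken to be the newly allocated 2GB one, and NodeResizeSketch is a hypothetical class that has nothing to do with the real scheduler code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy model of a single node. Every name here is hypothetical, not a YARN class.
class NodeResizeSketch {
    private int totalGb;
    private int usedGb;
    final List<Integer> pendingGb = new ArrayList<>();   // outstanding requests
    final List<Integer> allocatedGb = new ArrayList<>(); // granted on heartbeats

    NodeResizeSketch(int totalGb) {
        this.totalGb = totalGb;
    }

    // May be negative right after the node is shrunk below its current usage.
    int availableGb() {
        return totalGb - usedGb;
    }

    void launch(int gb) { usedGb += gb; }
    void complete(int gb) { usedGb -= gb; }
    void resize(int newTotalGb) { totalGb = newTotalGb; }

    // On each heartbeat, grant every pending request that currently fits.
    void heartbeat() {
        Iterator<Integer> it = pendingGb.iterator();
        while (it.hasNext()) {
            int gb = it.next();
            if (gb <= availableGb()) {
                it.remove();
                usedGb += gb;
                allocatedGb.add(gb);
            }
        }
    }

    public static void main(String[] args) {
        NodeResizeSketch node = new NodeResizeSketch(4); // assumed 4GB node
        node.launch(4);                                  // fully booked
        node.pendingGb.add(2);
        node.pendingGb.add(3);
        node.resize(2);                                  // admin shrinks the node to 2GB
        for (int i = 0; i < 3; i++) {
            node.heartbeat();                            // nothing fits yet
        }
        node.complete(4);                                // first container finishes
        node.heartbeat();                                // the 2GB request fits now
        node.complete(2);                                // second container finishes
        node.heartbeat();                                // the 3GB request still cannot fit
        System.out.println("allocated=" + node.allocatedGb + " pending=" + node.pendingGb);
    }
}
```

A real test would of course drive the scheduler through node heartbeats and check for the absence of NPEs; the model only shows the expected allocation outcome.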

Which brings another question to my mind: what happens when this change 
resource command reduces the NM size to something less than the max container 
size allowed? E.g., let's say that the NM is 8GB and the max allowed container size 
is 4GB. So the RM accepts a 4GB request. Now the admin changes the NM to 2GB. At 
this point the previously accepted 4GB request cannot be satisfied and the 
application will get stuck. We may need to follow this up in a different jira. 
There may be some existing jiras related to max container size and actual NM 
resource size.

bq. Update transition as only 2 lines of code can be shared and shared a method 
across different class seems over-kill in this case
It's probably about personal stylistic choices. The number of lines of code is 
less important. What is more important is that having a shared method declares 
that these pieces of code are related to each other in a logical way (if such a 
relation exists). The dependency may be via a method in RMNodeImpl that's called 
by both transitions or one transition extending the other one. The choice 
depends on how the 2 pieces of code are related to each other.
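
A minimal sketch of the first option (a shared method called by both transitions) might look like this. The Node, Transition, WhenRunning and WhenNotRunning types are hypothetical simplifications, not the RMNodeImpl state machine.

```java
import java.util.ArrayList;
import java.util.List;

class TransitionSketch {
    // Hypothetical minimal node state.
    static class Node {
        int totalMb;
        final List<String> log = new ArrayList<>();
    }

    interface Transition {
        void apply(Node node, int newTotalMb);
    }

    // Shared step: both transitions call it, which documents that they are related.
    static void updateTotal(Node node, int newTotalMb) {
        node.totalMb = newTotalMb;
    }

    static class WhenRunning implements Transition {
        public void apply(Node node, int newTotalMb) {
            updateTotal(node, newTotalMb);
        }
    }

    static class WhenNotRunning implements Transition {
        public void apply(Node node, int newTotalMb) {
            // Extra logging for the unusual case, then the same shared update.
            node.log.add("resource update on a node that is not running");
            updateTotal(node, newTotalMb);
        }
    }
}
```

Even though only one line is shared, the common helper makes the relationship between the two transitions explicit; having WhenNotRunning extend WhenRunning would be the alternative mentioned above.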

> Replace set resource change on RMNode/SchedulerNode directly with event 
> notification.
> -
>
> Key: YARN-1506
> URL: https://issues.apache.org/jira/browse/YARN-1506
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-1506-v1.patch, YARN-1506-v10.patch, 
> YARN-1506-v2.patch, YARN-1506-v3.patch, YARN-1506-v4.patch, 
> YARN-1506-v5.patch, YARN-1506-v6.patch, YARN-1506-v7.patch, 
> YARN-1506-v8.patch, YARN-1506-v9.patch
>
>
> According to Vinod's comments on YARN-312 
> (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087),
>  we should replace RMNode.setResourceOption() with some resource change event.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1970) Prepare YARN codebase for JUnit 4.11.

2014-04-21 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated YARN-1970:


Attachment: YARN-1970.1.patch

> Prepare YARN codebase for JUnit 4.11.
> -
>
> Key: YARN-1970
> URL: https://issues.apache.org/jira/browse/YARN-1970
> Project: Hadoop YARN
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>Priority: Minor
> Attachments: YARN-1970.1.patch
>
>
> HADOOP-10503 upgrades the entire Hadoop repo to use JUnit 4.11. Some of the 
> YARN code needs some minor updates to fix deprecation warnings and test 
> isolation problems before the upgrade.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1970) Prepare YARN codebase for JUnit 4.11.

2014-04-21 Thread Chris Nauroth (JIRA)
Chris Nauroth created YARN-1970:
---

 Summary: Prepare YARN codebase for JUnit 4.11.
 Key: YARN-1970
 URL: https://issues.apache.org/jira/browse/YARN-1970
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 2.4.0, 3.0.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Priority: Minor
 Attachments: YARN-1970.1.patch

HADOOP-10503 upgrades the entire Hadoop repo to use JUnit 4.11. Some of the 
YARN code needs some minor updates to fix deprecation warnings and test 
isolation problems before the upgrade.




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1962) Timeline server is enabled by default

2014-04-21 Thread Mohammad Kamrul Islam (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975885#comment-13975885
 ] 

Mohammad Kamrul Islam commented on YARN-1962:
-

[~zjshen] Thanks for the feedback.
I will upload a new patch.


> Timeline server is enabled by default
> -
>
> Key: YARN-1962
> URL: https://issues.apache.org/jira/browse/YARN-1962
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
> Attachments: YARN-1962.1.patch
>
>
> Since the Timeline server is not mature and secured yet, enabling it by default 
> might create some confusion.
> We were playing with 2.4.0 and found a lot of exceptions for the distributed 
> shell example related to connection refused errors. Btw, we didn't run TS 
> because it is not secured yet.
> Although it is possible to explicitly turn it off through yarn-site config, 
> in my opinion this extra change for this new service is not worthwhile at this 
> point.  
> This JIRA is to turn it off by default.
> If there is an agreement, I can put up a simple patch for this.
> {noformat}
> 14/04/17 23:24:33 ERROR impl.TimelineClientImpl: Failed to get the response 
> from the timeline server.
> com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: 
> Connection refused
>   at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
>   at com.sun.jersey.api.client.Client.handle(Client.java:648)
>   at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
>   at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
>   at 
> com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
> Caused by: java.net.ConnectException: Connection refused
>   at java.net.PlainSocketImpl.socketConnect(Native Method)
>   at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>   at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>   at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>   at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>   at java.net.Socket.connect(Socket.java:579)
>   at java.net.Socket.connect(Socket.java:528)
>   at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
>   at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
>   at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
>   at sun.net.www.http.HttpClient. impl.TimelineClientImpl: Failed to get the response from the timeline server.
> com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: 
> Connection refused
>   at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
>   at com.sun.jersey.api.client.Client.handle(Client.java:648)
>   at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
>   at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
>   at 
> com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
> Caused by: java.net.ConnectException: Connection refused
>   at java.net.PlainSocketImpl.socketConnect(Native Method)
>   at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>   at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>   at 
> java.net.AbstractPlainSocketIm

[jira] [Commented] (YARN-1964) Support Docker containers in YARN

2014-04-21 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975790#comment-13975790
 ] 

Vinod Kumar Vavilapalli commented on YARN-1964:
---

YARN already has platform specific plugins like LinuxContainerExecutor, this 
would just be one more option.

> Support Docker containers in YARN
> -
>
> Key: YARN-1964
> URL: https://issues.apache.org/jira/browse/YARN-1964
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
>
> Docker (https://www.docker.io/) is, increasingly, a very popular container 
> technology.
> In context of YARN, the support for Docker will provide a very elegant 
> solution to allow applications to *package* their software into a Docker 
> container (entire Linux file system incl. custom versions of perl, python 
> etc.) and use it as a blueprint to launch all their YARN containers with 
> requisite software environment. This provides both consistency (all YARN 
> containers will have the same software environment) and isolation (no 
> interference with whatever is installed on the physical machine).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1969) Earliest Deadline First Scheduling in the Fair Scheduler

2014-04-21 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1969:
-

Summary: Earliest Deadline First Scheduling in the Fair Scheduler  (was: 
Earliest Deadline First Scheduling)

> Earliest Deadline First Scheduling in the Fair Scheduler
> 
>
> Key: YARN-1969
> URL: https://issues.apache.org/jira/browse/YARN-1969
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Maysam Yabandeh
>Assignee: Maysam Yabandeh
>
> What we are observing is that some big jobs with many allocated containers 
> are waiting for a few containers to finish. Under *fair-share scheduling*, 
> however, they have a low priority since there are other jobs (usually much 
> smaller newcomers) that are using resources way below their fair share; 
> hence newly released containers are not offered to the big, yet 
> close-to-be-finished job. Nevertheless, everybody would benefit from an 
> "unfair" scheduling that offers the resource to the big job, since the sooner 
> the big job finishes, the sooner it releases its "many" allocated resources 
> to be used by other jobs. In other words, what we require is a kind of 
> variation of *Earliest Deadline First scheduling* that takes into account 
> the number of already-allocated resources and estimated time to finish.
> http://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling
> For example, if a job is using MEM GB of memory and is expected to finish in 
> TIME minutes, the priority in scheduling would be a function p of (MEM, 
> TIME). The expected time to finish can be estimated by the AppMaster using 
> TaskRuntimeEstimator#estimatedRuntime and be supplied to the RM in the resource 
> request messages. To be less susceptible to the issue of apps gaming the 
> system, we can have this scheduling limited to *only within a queue*: i.e., 
> adding an EarliestDeadlinePolicy that extends SchedulingPolicy and letting the 
> queues use it by setting the "schedulingPolicy" field.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1969) Earliest Deadline First Scheduling

2014-04-21 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975776#comment-13975776
 ] 

Maysam Yabandeh commented on YARN-1969:
---

An example of this behavior is when a job preempts its reducers to free space 
for its mappers. The freed space is, however, first offered to the app that has 
already made a reservation on the node. Then it is offered to the queues 
that are using less than their fair share, then to the queue 
to which the app belongs, and only at the end to the app that released 
the resource in the first place. Note that preemption is just one example and 
we observe similar inefficiencies when preemption is not involved.
There are already open jiras that could alleviate the problem, e.g., if 
YARN-1197 is finished the MRAppMaster can reuse the reducer's container instead 
of returning it to the RM. Or YARN-1404 would allow for more flexible scheduling 
for individual apps. Nevertheless, it seems to us that augmenting the fair scheduler 
to take such priorities into account addresses the problem in a more general 
fashion.

I would highly appreciate your feedback.

> Earliest Deadline First Scheduling
> --
>
> Key: YARN-1969
> URL: https://issues.apache.org/jira/browse/YARN-1969
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Maysam Yabandeh
>Assignee: Maysam Yabandeh
>
> What we are observing is that some big jobs with many allocated containers 
> are waiting for a few containers to finish. Under *fair-share scheduling*, 
> however, they have a low priority since there are other jobs (usually much 
> smaller newcomers) that are using resources way below their fair share; 
> hence newly released containers are not offered to the big, yet 
> close-to-be-finished job. Nevertheless, everybody would benefit from an 
> "unfair" scheduling that offers the resource to the big job, since the sooner 
> the big job finishes, the sooner it releases its "many" allocated resources 
> to be used by other jobs. In other words, what we require is a kind of 
> variation of *Earliest Deadline First scheduling* that takes into account 
> the number of already-allocated resources and estimated time to finish.
> http://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling
> For example, if a job is using MEM GB of memory and is expected to finish in 
> TIME minutes, the priority in scheduling would be a function p of (MEM, 
> TIME). The expected time to finish can be estimated by the AppMaster using 
> TaskRuntimeEstimator#estimatedRuntime and be supplied to the RM in the resource 
> request messages. To be less susceptible to the issue of apps gaming the 
> system, we can have this scheduling limited to *only within a queue*: i.e., 
> adding an EarliestDeadlinePolicy that extends SchedulingPolicy and letting the 
> queues use it by setting the "schedulingPolicy" field.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1969) Earliest Deadline First Scheduling

2014-04-21 Thread Maysam Yabandeh (JIRA)
Maysam Yabandeh created YARN-1969:
-

 Summary: Earliest Deadline First Scheduling
 Key: YARN-1969
 URL: https://issues.apache.org/jira/browse/YARN-1969
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Maysam Yabandeh
Assignee: Maysam Yabandeh


What we are observing is that some big jobs with many allocated containers are 
waiting for a few containers to finish. Under *fair-share scheduling*, however, 
they have a low priority since there are other jobs (usually much smaller 
newcomers) that are using resources way below their fair share; hence newly released 
containers are not offered to the big, yet close-to-be-finished job. 
Nevertheless, everybody would benefit from an "unfair" scheduling that offers 
the resource to the big job, since the sooner the big job finishes, the sooner 
it releases its "many" allocated resources to be used by other jobs. In other 
words, what we require is a kind of variation of *Earliest Deadline First 
scheduling* that takes into account the number of already-allocated resources 
and estimated time to finish.
http://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling

For example, if a job is using MEM GB of memory and is expected to finish in 
TIME minutes, the priority in scheduling would be a function p of (MEM, TIME). 
The expected time to finish can be estimated by the AppMaster using 
TaskRuntimeEstimator#estimatedRuntime and be supplied to the RM in the resource 
request messages. To be less susceptible to the issue of apps gaming the 
system, we can have this scheduling limited to *only within a queue*: i.e., 
adding an EarliestDeadlinePolicy that extends SchedulingPolicy and letting the 
queues use it by setting the "schedulingPolicy" field.
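
As a rough illustration of the priority function p(MEM, TIME), here is one possible choice, sketched under assumptions: p is taken to be MEM divided by TIME (memory released per remaining minute), which the description does not specify, and EdfPolicySketch is a hypothetical standalone class that does not implement the Fair Scheduler's SchedulingPolicy.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class EdfPolicySketch {
    // Hypothetical view of a job: memory in use and estimated minutes to finish.
    static final class Job {
        final String name;
        final double memGb;
        final double minutesLeft;

        Job(String name, double memGb, double minutesLeft) {
            this.name = name;
            this.memGb = memGb;
            this.minutesLeft = minutesLeft;
        }
    }

    // Assumed p(MEM, TIME) = MEM / TIME; TIME is clamped to avoid division by zero.
    static double priority(Job job) {
        return job.memGb / Math.max(job.minutesLeft, 1.0);
    }

    static final Comparator<Job> HIGHEST_PRIORITY_FIRST =
        Comparator.<Job>comparingDouble(EdfPolicySketch::priority).reversed();

    // Names of the jobs in the order they would be offered a freed container.
    static List<String> offerOrder(List<Job> jobs) {
        List<Job> sorted = new ArrayList<>(jobs);
        sorted.sort(HIGHEST_PRIORITY_FIRST);
        List<String> names = new ArrayList<>();
        for (Job job : sorted) {
            names.add(job.name);
        }
        return names;
    }
}
```

With this choice, a job holding 100GB that is 2 minutes from finishing is offered resources before a 4GB job with 30 minutes left, which is the "unfair" behavior described above.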



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1968) YARN Admin service should have more fine-grained ACL which is based on mapping of users with methods/operations.

2014-04-21 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-1968:
-

Description: 
AdminService's operations today cover different dimensions of management: some 
are about user management while others are about cluster management, etc. 
Today, we only check whether a user belongs to an authorized group to decide 
whether he can execute operations in the admin service. The result is that a user 
can either execute all operations or none, which is a simple strategy but not a 
very precise one, so we cannot split different management roles among several 
admins. We may need more fine-grained ACLs which can authorize a user for a 
subset of the operations in AdminService.

  was:
AdminService's operations today cover different dimensions of management: some 
are about user management while others are about cluster management. 
Today, we only check whether a user belongs to an authorized group to decide 
whether he can execute operations in the admin service. The result is that he can 
either execute all operations or none, which is a simple strategy but not a very 
precise one. We may need more fine-grained ACLs which can authorize a user for 
a subset of the operations in AdminService.


> YARN Admin service should have more fine-grained ACL which is based on 
> mapping of users with methods/operations.
> 
>
> Key: YARN-1968
> URL: https://issues.apache.org/jira/browse/YARN-1968
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Junping Du
>
> AdminService's operations today cover different dimensions of management: some 
> are about user management while others are about cluster management, etc. 
> Today, we only check whether a user belongs to an authorized group to decide 
> whether he can execute operations in the admin service. The result is that a 
> user can either execute all operations or none, which is a simple strategy but 
> not a very precise one, so we cannot split different management roles among 
> several admins. We may need more fine-grained ACLs which can authorize a user 
> for a subset of the operations in AdminService.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1968) YARN Admin service should have more fine-grained ACL which is based on mapping of users with methods/operations.

2014-04-21 Thread Junping Du (JIRA)
Junping Du created YARN-1968:


 Summary: YARN Admin service should have more fine-grained ACL 
which is based on mapping of users with methods/operations.
 Key: YARN-1968
 URL: https://issues.apache.org/jira/browse/YARN-1968
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Junping Du


AdminService's operations today cover different dimensions of management: some 
are about user management while others are about cluster management. 
Today, we only check whether a user belongs to an authorized group to decide 
whether he can execute operations in the admin service. The result is that he can 
either execute all operations or none, which is a simple strategy but not a very 
precise one. We may need more fine-grained ACLs which can authorize a user for 
a subset of the operations in AdminService.
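
One way to picture the "mapping of users with methods/operations" is a per-operation allow list. The sketch below is only an illustration: AdminAclSketch and the user names are hypothetical, the operation strings are used purely as example keys, and nothing here reflects how AdminService actually performs its checks.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical per-operation ACL: operation name -> users allowed to call it.
class AdminAclSketch {
    private final Map<String, Set<String>> allowedUsersByOperation = new HashMap<>();

    void allow(String operation, String... users) {
        allowedUsersByOperation
            .computeIfAbsent(operation, key -> new HashSet<>())
            .addAll(Arrays.asList(users));
    }

    // Deny by default: an operation with no entry is open to nobody.
    boolean isAllowed(String user, String operation) {
        Set<String> users = allowedUsersByOperation.get(operation);
        return users != null && users.contains(user);
    }
}
```

For example, after allow("refreshQueues", "clusterAdmin") and allow("refreshUserToGroupsMappings", "userAdmin"), the cluster admin can refresh queues but not the user-to-group mappings, which is the role separation the description asks for.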



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1962) Timeline server is enabled by default

2014-04-21 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975708#comment-13975708
 ] 

Zhijie Shen commented on YARN-1962:
---

[~kamrul], thanks for the patch. I've some comments on it:

1. You need to change the default in yarn-default.xml as well.
{code}
<property>
  <name>yarn.timeline-service.enabled</name>
  <value>true</value>
</property>
{code}

2. By doing this, you probably need to update other test cases to enable the 
timeline client. Please search through all the calls of TimelineClient.
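
To illustrate why the default matters for callers and tests, here is a small sketch of a client that only publishes when the flag is on. It uses java.util.Properties rather than Hadoop's Configuration, and TimelineFlagSketch with its publish method is hypothetical; only the property name comes from the comment above.

```java
import java.util.Properties;

class TimelineFlagSketch {
    static final String ENABLED_KEY = "yarn.timeline-service.enabled";

    // defaultEnabled stands in for whatever value yarn-default.xml ships with.
    static boolean isEnabled(Properties conf, boolean defaultEnabled) {
        String value = conf.getProperty(ENABLED_KEY, String.valueOf(defaultEnabled));
        return Boolean.parseBoolean(value);
    }

    // Returns true if the entity would be posted; skips quietly when disabled.
    static boolean publish(Properties conf, boolean defaultEnabled, String entity) {
        if (!isEnabled(conf, defaultEnabled)) {
            return false; // no connection is attempted, so no "Connection refused"
        }
        // A real client would POST the entity to the timeline server here.
        return true;
    }
}
```

Once the default flips to false, any test that expects entities to be published has to set the property explicitly, which is the follow-up work mentioned in point 2.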

> Timeline server is enabled by default
> -
>
> Key: YARN-1962
> URL: https://issues.apache.org/jira/browse/YARN-1962
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
> Attachments: YARN-1962.1.patch
>
>
> Since the Timeline server is not mature and secured yet, enabling it by default 
> might create some confusion.
> We were playing with 2.4.0 and found a lot of exceptions for the distributed 
> shell example related to connection refused errors. Btw, we didn't run TS 
> because it is not secured yet.
> Although it is possible to explicitly turn it off through yarn-site config, 
> in my opinion this extra change for this new service is not worthwhile at this 
> point.  
> This JIRA is to turn it off by default.
> If there is an agreement, I can put up a simple patch for this.
> {noformat}
> 14/04/17 23:24:33 ERROR impl.TimelineClientImpl: Failed to get the response 
> from the timeline server.
> com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: 
> Connection refused
>   at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
>   at com.sun.jersey.api.client.Client.handle(Client.java:648)
>   at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
>   at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
>   at 
> com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
> Caused by: java.net.ConnectException: Connection refused
>   at java.net.PlainSocketImpl.socketConnect(Native Method)
>   at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>   at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>   at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>   at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>   at java.net.Socket.connect(Socket.java:579)
>   at java.net.Socket.connect(Socket.java:528)
>   at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
>   at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
>   at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
>   at sun.net.www.http.HttpClient.[...]
> [...] impl.TimelineClientImpl: Failed to get the response from the timeline server.
> com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: 
> Connection refused
>   at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
>   at com.sun.jersey.api.client.Client.handle(Client.java:648)
>   at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
>   at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
>   at 
> com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
> Caused by: java.net.ConnectException: Connection refused
>   at java.net.PlainSocketImpl.socketCon

[jira] [Commented] (YARN-1954) Add waitFor to AMRMClient(Async)

2014-04-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975687#comment-13975687
 ] 

Hadoop QA commented on YARN-1954:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12641071/YARN-1954.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3601//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3601//console

This message is automatically generated.

> Add waitFor to AMRMClient(Async)
> 
>
> Key: YARN-1954
> URL: https://issues.apache.org/jira/browse/YARN-1954
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: client
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Zhijie Shen
>Assignee: Tsuyoshi OZAWA
> Attachments: YARN-1954.1.patch, YARN-1954.2.patch, YARN-1954.3.patch
>
>
> Recently, I saw some use cases of AMRMClient(Async). The painful thing is 
> that the main non-daemon thread has to sit in a dummy loop to prevent the AM 
> process from exiting before all the tasks are done, while unregistration is 
> triggered on a separate daemon thread by callback methods (in 
> particular when using AMRMClientAsync). IMHO, it would be beneficial to add 
> a waitFor method to AMRMClient(Async) to block the AM until unregistration or 
> a user-supplied check point, so that users don't need to write the loop 
> themselves.





[jira] [Commented] (YARN-1954) Add waitFor to AMRMClient(Async)

2014-04-21 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975658#comment-13975658
 ] 

Tsuyoshi OZAWA commented on YARN-1954:
--

Thank you for the comment, [~zjshen]. I updated the patch:

1. Changed the APIs to wait indefinitely.
2. Added logging in the main loop. I am concerned that logging too much could 
add overhead. Should we add a logging interval as a new parameter?
3. Created YARN-1967 to address this issue.
4. Added AMRMClient support.
5. Added separate methods to test waitFor().
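
The waiting behaviour discussed in this comment, including the open question about a logging interval, can be sketched as follows. This is a minimal sketch under assumptions: WaitForSketch, the waitFor signature, and the logEvery parameter are hypothetical and are not taken from the YARN-1954 patch.

```java
import java.util.function.Supplier;

// Hypothetical helper, not the API from the YARN-1954 patch.
class WaitForSketch {
    /**
     * Blocks until check returns true, polling every intervalMillis.
     * Logs once every logEvery failed polls (never if logEvery <= 0)
     * and returns the number of failed polls.
     */
    static int waitFor(Supplier<Boolean> check, long intervalMillis, int logEvery) {
        int failedPolls = 0;
        while (!check.get()) {
            failedPolls++;
            if (logEvery > 0 && failedPolls % logEvery == 0) {
                System.out.println("Still waiting after " + failedPolls + " checks");
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
        return failedPolls;
    }
}
```

Logging only every logEvery polls is one way to bound the logging overhead raised in item 2; whether that belongs in the public API is the question left open above.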

> Add waitFor to AMRMClient(Async)
> 
>
> Key: YARN-1954
> URL: https://issues.apache.org/jira/browse/YARN-1954
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: client
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Zhijie Shen
>Assignee: Tsuyoshi OZAWA
> Attachments: YARN-1954.1.patch, YARN-1954.2.patch, YARN-1954.3.patch
>
>
> Recently, I saw some use cases of AMRMClient(Async). The painful thing is 
> that the main non-daemon thread has to sit in a dummy loop to prevent the AM 
> process from exiting before all the tasks are done, while unregistration is 
> triggered on a separate daemon thread by callback methods (in 
> particular when using AMRMClientAsync). IMHO, it would be beneficial to add 
> a waitFor method to AMRMClient(Async) to block the AM until unregistration or 
> a user-supplied check point, so that users don't need to write the loop 
> themselves.





[jira] [Updated] (YARN-1954) Add waitFor to AMRMClient(Async)

2014-04-21 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-1954:
-

Attachment: YARN-1954.3.patch

> Add waitFor to AMRMClient(Async)
> 
>
> Key: YARN-1954
> URL: https://issues.apache.org/jira/browse/YARN-1954
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: client
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Zhijie Shen
>Assignee: Tsuyoshi OZAWA
> Attachments: YARN-1954.1.patch, YARN-1954.2.patch, YARN-1954.3.patch
>
>
> Recently, I saw some use cases of AMRMClient(Async). The painful thing is 
> that the main non-daemon thread has to sit in a dummy loop to prevent the AM 
> process from exiting before all the tasks are done, while unregistration is 
> triggered on a separate daemon thread by callback methods (in 
> particular when using AMRMClientAsync). IMHO, it would be beneficial to add 
> a waitFor method to AMRMClient(Async) to block the AM until unregistration or 
> a user-supplied check point, so that users don't need to write the loop 
> themselves.





[jira] [Commented] (YARN-1964) Support Docker containers in YARN

2014-04-21 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975610#comment-13975610
 ] 

Tsuyoshi OZAWA commented on YARN-1964:
--

Do you mean we will have DockerContainerExecutor instead of 
LinuxContainerExecutor?

> Support Docker containers in YARN
> -
>
> Key: YARN-1964
> URL: https://issues.apache.org/jira/browse/YARN-1964
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
>
> Docker (https://www.docker.io/) is, increasingly, a very popular container 
> technology.
> In context of YARN, the support for Docker will provide a very elegant 
> solution to allow applications to *package* their software into a Docker 
> container (entire Linux file system incl. custom versions of perl, python 
> etc.) and use it as a blueprint to launch all their YARN containers with 
> requisite software environment. This provides both consistency (all YARN 
> containers will have the same software environment) and isolation (no 
> interference with whatever is installed on the physical machine).





[jira] [Commented] (YARN-1964) Support Docker containers in YARN

2014-04-21 Thread jay vyas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975599#comment-13975599
 ] 

jay vyas commented on YARN-1964:


This takes us away from the Java idiom of packaging apps as jars. It's a 
pretty bold step, so let me play devil's advocate, because I think it might not 
be the best idea.

1) Tying YARN's JVM NodeManagers to LCEs adds a new dependency to the YARN 
stack, which adds new building/compiling/maintaining costs.

2) NodeManagers are quite commonly run in Linux containers. This could 
lead to containers launching NMs, which in turn launch containers. Seems 
kind of yucky, don't you think?

3) It forces YARN to be aware of Linux containers. This might 
encourage the creation of application code that doesn't run easily outside of 
a containerized environment. As awesome as LCEs are, most java and bigdata apps 
running in YARN with the hadoop .

> Support Docker containers in YARN
> -
>
> Key: YARN-1964
> URL: https://issues.apache.org/jira/browse/YARN-1964
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
>
> Docker (https://www.docker.io/) is, increasingly, a very popular container 
> technology.
> In context of YARN, the support for Docker will provide a very elegant 
> solution to allow applications to *package* their software into a Docker 
> container (entire Linux file system incl. custom versions of perl, python 
> etc.) and use it as a blueprint to launch all their YARN containers with 
> requisite software environment. This provides both consistency (all YARN 
> containers will have the same software environment) and isolation (no 
> interference with whatever is installed on the physical machine).





[jira] [Resolved] (YARN-1966) Capacity Scheduler acl_submit_applications in Leaf Queue finally considers root queue default always

2014-04-21 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved YARN-1966.
--

Resolution: Duplicate

This is a duplicate of YARN-1269 and related to YARN-1941 and YARN-1951.

In any case I don't think we want to special-case the root queue, as the same 
issue could exist in a subtree where access to the subtree root allows access 
to any queue within the subtree.

Actually I believe this is by design.  It allows admins to configure access to 
an entire subtree of queues by giving access to the root of the subtree rather 
than having to add the access to each leaf queue.  So for your example above 
you'll want to set the root queue's ACLs to be empty so that one must have 
access to the leaf queue in order to submit.
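
The subtree behaviour described above can be modelled in a few lines. This is only an illustrative sketch under assumptions: QueueSketch and its hasAccess method are hypothetical and do not reproduce the CapacityScheduler code; "*" stands for "every user".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the behaviour described above; not CapacityScheduler code.
class QueueSketch {
    private final QueueSketch parent;     // null for the root queue
    private final Set<String> submitAcl;  // "*" means every user

    QueueSketch(QueueSketch parent, String... users) {
        this.parent = parent;
        this.submitAcl = new HashSet<>(Arrays.asList(users));
    }

    // Access to this queue or to any ancestor grants access.
    boolean hasAccess(String user) {
        if (submitAcl.contains("*") || submitAcl.contains(user)) {
            return true;
        }
        return parent != null && parent.hasAccess(user);
    }
}
```

In this model a leaf restricted to "hadoop" under a root that allows "*" still admits every user, while the same leaf under a root with an empty ACL admits only "hadoop".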

> Capacity Scheduler acl_submit_applications in Leaf Queue finally considers 
> root queue default always
> 
>
> Key: YARN-1966
> URL: https://issues.apache.org/jira/browse/YARN-1966
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Sunil G
> Attachments: Yarn-1966.1.patch
>
>
> Given the configuration below,
> <property>
>   <name>yarn.scheduler.capacity.root.queues</name>
>   <value>fast,medium</value>
> </property>
> <property>
>   <name>yarn.scheduler.capacity.root.fast.acl_submit_applications</name>
>   <value>hadoop</value>
> </property>
> <property>
>   <name>yarn.scheduler.capacity.root.slow.acl_submit_applications</name>
>   <value>hadoop</value>
> </property>
> In this case, the expectation is that only the "hadoop" user can submit jobs to the 
> "fast" or "slow" queue.
> But currently any user can submit jobs to these queues.





[jira] [Created] (YARN-1967) onShutdownRequest() is not called when AMRMClientAsyncImpl#unregisterApplicationMaster() is called

2014-04-21 Thread Tsuyoshi OZAWA (JIRA)
Tsuyoshi OZAWA created YARN-1967:


 Summary: onShutdownRequest() is not called when 
AMRMClientAsyncImpl#unregisterApplicationMaster() is called
 Key: YARN-1967
 URL: https://issues.apache.org/jira/browse/YARN-1967
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Tsuyoshi OZAWA


While working on YARN-1954, I found that onShutdownRequest() is not called when 
AMRMClientAsyncImpl#unregisterApplicationMaster() is called.
Should we fix it by calling onShutdownRequest() or by adding a new 
hook (onUnregisteredApplicationRequest) to CallbackHandler?
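
The first option, calling onShutdownRequest() on unregistration, might look like the sketch below. The types are hypothetical stand-ins (AsyncClientSketch and its nested CallbackHandler); they only mirror the method names mentioned above and are not the real AMRMClientAsync classes.

```java
// Hypothetical stand-ins; the real AMRMClientAsync types are not used here.
class AsyncClientSketch {
    interface CallbackHandler {
        void onShutdownRequest();
    }

    private final CallbackHandler handler;
    private boolean registered = true;

    AsyncClientSketch(CallbackHandler handler) {
        this.handler = handler;
    }

    // Option 1 from the description: notify the handler on unregistration.
    void unregisterApplicationMaster() {
        if (!registered) {
            return; // already unregistered; do not notify twice
        }
        registered = false;
        handler.onShutdownRequest();
    }
}
```

The second option would add a separate hook to the handler instead, which keeps "shutdown requested" and "unregistered" distinguishable for callers.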





[jira] [Resolved] (YARN-1657) Timeout occurs in TestNMClient

2014-04-21 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA resolved YARN-1657.
-

Resolution: Cannot Reproduce

> Timeout occurs in TestNMClient
> --
>
> Key: YARN-1657
> URL: https://issues.apache.org/jira/browse/YARN-1657
> Project: Hadoop YARN
>  Issue Type: Test
>Affects Versions: 3.0.0
>Reporter: Akira AJISAKA
>
> A timeout occurs in TestNMClient when a patch is tested by Jenkins.
> The following comment can be seen in YARN-1480, YARN-1611, and YARN-888.
> {code}
> {color:red}-1 core tests{color}.  The following test timeouts occurred in 
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client:
> org.apache.hadoop.yarn.client.api.impl.TestNMClient
> {code}





[jira] [Commented] (YARN-1657) Timeout occurs in TestNMClient

2014-04-21 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975543#comment-13975543
 ] 

Akira AJISAKA commented on YARN-1657:
-

Okay, closing this JIRA. I'll reopen this if the timeout occurs again.

> Timeout occurs in TestNMClient
> --
>
> Key: YARN-1657
> URL: https://issues.apache.org/jira/browse/YARN-1657
> Project: Hadoop YARN
>  Issue Type: Test
>Affects Versions: 3.0.0
>Reporter: Akira AJISAKA
>
> A timeout occurs in TestNMClient when a patch is tested by Jenkins.
> The following comment can be seen in YARN-1480, YARN-1611, and YARN-888.
> {code}
> {color:red}-1 core tests{color}.  The following test timeouts occurred in 
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client:
> org.apache.hadoop.yarn.client.api.impl.TestNMClient
> {code}





[jira] [Commented] (YARN-1657) Timeout occurs in TestNMClient

2014-04-21 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975542#comment-13975542
 ] 

Tsuyoshi OZAWA commented on YARN-1657:
--

I cannot reproduce the problem either. [~ajisakaa], how about closing this JIRA 
and reopening it if we face the problem again?

> Timeout occurs in TestNMClient
> --
>
> Key: YARN-1657
> URL: https://issues.apache.org/jira/browse/YARN-1657
> Project: Hadoop YARN
>  Issue Type: Test
>Affects Versions: 3.0.0
>Reporter: Akira AJISAKA
>
> A timeout occurs in TestNMClient when a patch is tested by Jenkins.
> The following comment can be seen in YARN-1480, YARN-1611, and YARN-888.
> {code}
> {color:red}-1 core tests{color}.  The following test timeouts occurred in 
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client:
> org.apache.hadoop.yarn.client.api.impl.TestNMClient
> {code}





[jira] [Commented] (YARN-556) RM Restart phase 2 - Work preserving restart

2014-04-21 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975541#comment-13975541
 ] 

Tsuyoshi OZAWA commented on YARN-556:
-

[~adhoot], I glanced over your patch.

1. Can you split your patch by subtask? It currently includes the overall 
changes for this task, and we should discuss the finer points on each subtask JIRA.
2. IMO, the prototype is enough to validate the design. Do you have any additional 
comments about the design docs?

I'd like to include this feature in 2.5.0 (maybe May - June?), so let's work 
together :-)

> RM Restart phase 2 - Work preserving restart
> 
>
> Key: YARN-556
> URL: https://issues.apache.org/jira/browse/YARN-556
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: Work Preserving RM Restart.pdf, 
> WorkPreservingRestartPrototype.001.patch
>
>
> YARN-128 covered storing the state needed for the RM to recover critical 
> information. This umbrella jira will track changes needed to recover the 
> running state of the cluster so that work can be preserved across RM restarts.





[jira] [Updated] (YARN-1966) Capacity Scheduler acl_submit_applications in Leaf Queue finally considers root queue default always

2014-04-21 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-1966:
--

Attachment: Yarn-1966.1.patch

The issue happens because, when an application is submitted, the hasAccess() 
check always reaches ParentQueue::hasAccess(). 
Eventually the parent is "root", and the check passes there 
(* is the default for acl_submit_applications and acl_administer_queue in the root 
queue).

So, to ensure that only one specified user can submit jobs to a leaf queue, the 
following configurations are mandatory in "root":
  root.acl_submit_applications 
  root.acl_administer_queue

The acl_administer_queue check has no relevance to submitting a job, but we are 
forced to configure it as well to achieve what is described in the problem 
statement of this issue. 
Also, if each leaf queue wants its own set of users, all of those users must 
also be listed in root. This is not good.

So it is better to skip the hasAccess() check if the parent queue is "root", as below:
if (rootQueue) {
   return false;
}

> Capacity Scheduler acl_submit_applications in Leaf Queue finally considers 
> root queue default always
> 
>
> Key: YARN-1966
> URL: https://issues.apache.org/jira/browse/YARN-1966
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Sunil G
> Attachments: Yarn-1966.1.patch
>
>
> Given the configuration below,
> <property>
>   <name>yarn.scheduler.capacity.root.queues</name>
>   <value>fast,medium</value>
> </property>
> <property>
>   <name>yarn.scheduler.capacity.root.fast.acl_submit_applications</name>
>   <value>hadoop</value>
> </property>
> <property>
>   <name>yarn.scheduler.capacity.root.slow.acl_submit_applications</name>
>   <value>hadoop</value>
> </property>
> In this case, the expectation is that only the "hadoop" user can submit jobs to the 
> "fast" or "slow" queue.
> But currently any user can submit jobs to these queues.





[jira] [Created] (YARN-1966) Capacity Scheduler acl_submit_applications in Leaf Queue finally considers root queue default always

2014-04-21 Thread Sunil G (JIRA)
Sunil G created YARN-1966:
-

 Summary: Capacity Scheduler acl_submit_applications in Leaf Queue 
finally considers root queue default always
 Key: YARN-1966
 URL: https://issues.apache.org/jira/browse/YARN-1966
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Sunil G


Given the configuration below,
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>fast,medium</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.fast.acl_submit_applications</name>
  <value>hadoop</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.slow.acl_submit_applications</name>
  <value>hadoop</value>
</property>

In this case, the expectation is that only the "hadoop" user can submit jobs to the 
"fast" or "slow" queue.
But currently any user can submit jobs to these queues.





[jira] [Commented] (YARN-1962) Timeline server is enabled by default

2014-04-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975504#comment-13975504
 ] 

Hadoop QA commented on YARN-1962:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12641040/YARN-1962.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3600//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3600//console

This message is automatically generated.

> Timeline server is enabled by default
> -
>
> Key: YARN-1962
> URL: https://issues.apache.org/jira/browse/YARN-1962
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
> Attachments: YARN-1962.1.patch
>
>
> Since the Timeline server is not yet mature or secured, enabling it by default 
> might create some confusion.
> We were playing with 2.4.0 and found a lot of exceptions in the distributed 
> shell example related to a connection refused error. Btw, we didn't run TS 
> because it is not secured yet.
> It is possible to explicitly turn it off through the yarn-site config, but 
> in my opinion this extra change for a new service is not worthwhile at this 
> point.
> This JIRA is to turn it off by default.
> If there is agreement, I can put up a simple patch for this.
> {noformat}
> 14/04/17 23:24:33 ERROR impl.TimelineClientImpl: Failed to get the response 
> from the timeline server.
> com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: 
> Connection refused
>   at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
>   at com.sun.jersey.api.client.Client.handle(Client.java:648)
>   at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
>   at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
>   at 
> com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
> Caused by: java.net.ConnectException: Connection refused
>   at java.net.PlainSocketImpl.socketConnect(Native Method)
>   at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>   at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>   at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>   at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>   at java.net.Socket.connect(Socket.java:579)
>   at java.net.Socket.connect(Socket.java:528)
>   at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
>   at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
>   at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
>   at sun.net.www.http.HttpClient.[...]
> [...] impl.TimelineClientImpl: Failed to get the response from the timeline server.
> com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: 
> Connection refused
>   at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
>  

[jira] [Commented] (YARN-1879) Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol

2014-04-21 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975499#comment-13975499
 ] 

Tsuyoshi OZAWA commented on YARN-1879:
--

[~xgong] and [~jianhe], do you have additional comments or opinions? Let me 
know if anything is unclear.

> Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol
> ---
>
> Key: YARN-1879
> URL: https://issues.apache.org/jira/browse/YARN-1879
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Tsuyoshi OZAWA
>Priority: Critical
> Attachments: YARN-1879.1.patch, YARN-1879.1.patch, 
> YARN-1879.2-wip.patch, YARN-1879.2.patch, YARN-1879.3.patch, 
> YARN-1879.4.patch, YARN-1879.5.patch
>
>






[jira] [Updated] (YARN-313) Add Admin API for supporting node resource configuration in command line

2014-04-21 Thread Kenji Kikushima (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenji Kikushima updated YARN-313:
-

Attachment: YARN-313-sample.patch

Hi, I implemented a refreshResources option as a prototype. I know this ticket is 
already assigned; my aim is to help move it forward, not to interfere. Please refer to 
this patch if you are interested. Thanks.
- This patch contains RefreshResourcesRequest and RefreshResourcesResponse to 
call updateNodeResource from the client indirectly.
- The DynamicResourceConfiguration class and dynamic-resources.xml are introduced 
to configure resource options.


> Add Admin API for supporting node resource configuration in command line
> 
>
> Key: YARN-313
> URL: https://issues.apache.org/jira/browse/YARN-313
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-313-sample.patch
>
>
> We should provide some admin interface, e.g. "yarn rmadmin -refreshResources" 
> to support changes of node's resource specified in a config file.





[jira] [Updated] (YARN-996) REST API support for node resource configuration

2014-04-21 Thread Kenji Kikushima (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenji Kikushima updated YARN-996:
-

Attachment: YARN-996-2.patch

Updated for YARN-1949. (Added rm.start() in tests.)

> REST API support for node resource configuration
> 
>
> Key: YARN-996
> URL: https://issues.apache.org/jira/browse/YARN-996
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Kenji Kikushima
> Attachments: YARN-996-2.patch, YARN-996-sample.patch
>
>
> Besides admin protocol and CLI, REST API should also be supported for node 
> resource configuration





[jira] [Updated] (YARN-1962) Timeline server is enabled by default

2014-04-21 Thread Mohammad Kamrul Islam (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Kamrul Islam updated YARN-1962:


Attachment: YARN-1962.1.patch

Thanks [~vinodkv].

Patch added

> Timeline server is enabled by default
> -
>
> Key: YARN-1962
> URL: https://issues.apache.org/jira/browse/YARN-1962
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.4.0
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
> Attachments: YARN-1962.1.patch
>
>
> Since the Timeline server is not yet mature or secured, enabling it by default 
> might create some confusion.
> We were playing with 2.4.0 and found a lot of exceptions in the distributed 
> shell example related to a connection refused error. Btw, we didn't run TS 
> because it is not secured yet.
> It is possible to explicitly turn it off through the yarn-site config, but 
> in my opinion this extra change for a new service is not worthwhile at this 
> point.
> This JIRA is to turn it off by default.
> If there is agreement, I can put up a simple patch for this.
> {noformat}
> 14/04/17 23:24:33 ERROR impl.TimelineClientImpl: Failed to get the response 
> from the timeline server.
> com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: 
> Connection refused
>   at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
>   at com.sun.jersey.api.client.Client.handle(Client.java:648)
>   at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
>   at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
>   at 
> com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
> Caused by: java.net.ConnectException: Connection refused
>   at java.net.PlainSocketImpl.socketConnect(Native Method)
>   at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>   at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>   at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>   at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>   at java.net.Socket.connect(Socket.java:579)
>   at java.net.Socket.connect(Socket.java:528)
>   at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
>   at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
>   at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
>   at sun.net.www.http.HttpClient.
> impl.TimelineClientImpl: Failed to get the response from the timeline server.
> com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: 
> Connection refused
>   at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
>   at com.sun.jersey.api.client.Client.handle(Client.java:648)
>   at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
>   at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
>   at 
> com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
> Caused by: java.net.ConnectException: Connection refused
>   at java.net.PlainSocketImpl.socketConnect(Native Method)
>   at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>   at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>   at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>