[jira] [Updated] (YARN-457) Setting updated nodes from null to null causes NPE in AllocateResponsePBImpl

2013-04-03 Thread Kenji Kikushima (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenji Kikushima updated YARN-457:
-

Attachment: YARN-457-2.patch

Sorry. I changed it to call initLocalNewNodeReportList() before clearing 
this.updatedNodes.

 Setting updated nodes from null to null causes NPE in AllocateResponsePBImpl
 

 Key: YARN-457
 URL: https://issues.apache.org/jira/browse/YARN-457
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Kenji Kikushima
Priority: Minor
  Labels: Newbie
 Attachments: YARN-457-2.patch, YARN-457.patch


 {code}
 if (updatedNodes == null) {
   this.updatedNodes.clear();
   return;
 }
 {code}
 If updatedNodes is already null, a NullPointerException is thrown.
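 For illustration only, here is a minimal, self-contained sketch of the null guard under 
 discussion (NodeReport is stubbed so the snippet compiles on its own; the actual patch 
 instead calls initLocalNewNodeReportList() inside AllocateResponsePBImpl):
 {code}
import java.util.ArrayList;
import java.util.List;

// Stand-alone sketch of a null-safe setter; not the committed patch.
public class UpdatedNodesHolder {
  // stand-in for org.apache.hadoop.yarn.api.records.NodeReport
  static class NodeReport { }

  private List<NodeReport> updatedNodes;   // may still be null before first use

  public void setUpdatedNodes(final List<NodeReport> updatedNodes) {
    if (updatedNodes == null) {
      if (this.updatedNodes != null) {     // guard: clear() on a null list was the NPE
        this.updatedNodes.clear();
      }
      return;
    }
    if (this.updatedNodes == null) {
      this.updatedNodes = new ArrayList<NodeReport>();
    }
    this.updatedNodes.clear();
    this.updatedNodes.addAll(updatedNodes);
  }
}
 {code}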

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-392) Make it possible to schedule to specific nodes without dropping locality

2013-04-03 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620705#comment-13620705
 ] 

Sandy Ryza commented on YARN-392:
-

Ok, I will work on a patch for the non-blacklist proposal. To clarify, should 
location-specific requests be able to coexist with non-location-specific 
requests at the same priority? 


 Make it possible to schedule to specific nodes without dropping locality
 

 Key: YARN-392
 URL: https://issues.apache.org/jira/browse/YARN-392
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Sandy Ryza
 Attachments: YARN-392-1.patch, YARN-392.patch


 Currently it's not possible to specify scheduling requests for specific nodes 
 and nowhere else. The RM automatically relaxes locality to rack and * and 
 assigns non-specified machines to the app.
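 As a rough illustration of the relaxation described above (conceptual sketch only, not 
 RM source; the names below are made up), the scheduler effectively falls back from node 
 to rack to *:
 {code}
import java.util.Set;

// Conceptual sketch of locality relaxation; not actual scheduler code.
public class LocalityRelaxationSketch {
  enum Locality { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

  static Locality place(String offeredHost, String offeredRack,
                        Set<String> requestedHosts, Set<String> requestedRacks) {
    if (requestedHosts.contains(offeredHost)) {
      return Locality.NODE_LOCAL;   // the host the app asked for
    }
    if (requestedRacks.contains(offeredRack)) {
      return Locality.RACK_LOCAL;   // relaxed to the rack
    }
    return Locality.OFF_SWITCH;     // relaxed to "*": any node in the cluster
  }
}
 {code}
 This JIRA is about letting an application stop at the first branch for selected requests.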

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-527) Local filecache mkdir fails

2013-04-03 Thread Knut O. Hellan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620711#comment-13620711
 ] 

Knut O. Hellan commented on YARN-527:
-

There is really no difference in how the directories are created. What probably 
happened under the hood is that the file system reached the maximum number of 
files in the filecache directory. That maximum is 32000 since we use ext3. 
I don't have the exact numbers for any of the disks from my checks, but I 
remember seeing above 30k in some places. The reason we were able to manually 
create directories might be that some automatic cleanup was happening. 
Does YARN clean the file cache?
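For anyone who wants to check this theory on a live node, a throwaway count like the 
following would show how close a filecache directory is to the ext3 limit (the path is 
taken from the stack trace below; adjust per disk):
{code}
import java.io.File;

// Quick, unofficial check of how many entries a local filecache dir holds.
public class FilecacheDirCount {
  public static void main(String[] args) {
    String dir = args.length > 0 ? args[0] : "/disk3/yarn/local/filecache";
    File[] entries = new File(dir).listFiles();
    int count = (entries == null) ? 0 : entries.length;
    System.out.println(dir + " holds " + count
        + " entries (ext3 allows roughly 32000 sub-directories per directory)");
  }
}
{code}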

 Local filecache mkdir fails
 ---

 Key: YARN-527
 URL: https://issues.apache.org/jira/browse/YARN-527
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.0-alpha
 Environment: RHEL 6.3 with CDH4.1.3 Hadoop, HA with two name nodes 
 and six worker nodes.
Reporter: Knut O. Hellan
Priority: Minor
 Attachments: yarn-site.xml


 Jobs failed with no other explanation than this stack trace:
 2013-03-29 16:46:02,671 INFO [AsyncDispatcher event handler] 
 org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diag
 nostics report from attempt_1364591875320_0017_m_00_0: 
 java.io.IOException: mkdir of /disk3/yarn/local/filecache/-42307893
 55400878397 failed
 at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:932)
 at 
 org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
 at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
 at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
 at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
 at 
 org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2333)
 at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Manually creating the directory worked. This behavior was common to at least 
 several nodes in the cluster.
 The situation was resolved by removing and recreating all 
 /disk?/yarn/local/filecache directories on all nodes.
 It is unclear whether Yarn struggled with the number of files or if there 
 were corrupt files in the caches. The situation was triggered by a node dying.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-425) coverage fix for yarn api

2013-04-03 Thread Aleksey Gorshkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Gorshkov updated YARN-425:
--

Attachment: YARN-425-branch-2-b.patch

 coverage fix for yarn api
 -

 Key: YARN-425
 URL: https://issues.apache.org/jira/browse/YARN-425
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 3.0.0, 0.23.7, 2.0.5-beta
Reporter: Aleksey Gorshkov
Assignee: Aleksey Gorshkov
 Attachments: YARN-425-branch-0.23.patch, YARN-425-branch-2-b.patch, 
 YARN-425-branch-2.patch, YARN-425-trunk-a.patch, YARN-425-trunk-b.patch, 
 YARN-425-trunk.patch


 coverage fix for yarn api
 patch YARN-425-trunk-a.patch for trunk
 patch YARN-425-branch-2.patch for branch-2
 patch YARN-425-branch-0.23.patch for branch-0.23

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-425) coverage fix for yarn api

2013-04-03 Thread Aleksey Gorshkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Gorshkov updated YARN-425:
--

Attachment: YARN-425-trunk-b.patch

 coverage fix for yarn api
 -

 Key: YARN-425
 URL: https://issues.apache.org/jira/browse/YARN-425
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 3.0.0, 0.23.7, 2.0.5-beta
Reporter: Aleksey Gorshkov
Assignee: Aleksey Gorshkov
 Attachments: YARN-425-branch-0.23.patch, YARN-425-branch-2-b.patch, 
 YARN-425-branch-2.patch, YARN-425-trunk-a.patch, YARN-425-trunk-b.patch, 
 YARN-425-trunk.patch


 coverage fix for yarn api
 patch YARN-425-trunk-a.patch for trunk
 patch YARN-425-branch-2.patch for branch-2
 patch YARN-425-branch-0.23.patch for branch-0.23

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-425) coverage fix for yarn api

2013-04-03 Thread Aleksey Gorshkov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620843#comment-13620843
 ] 

Aleksey Gorshkov commented on YARN-425:
---


Updated the patch for trunk: YARN-425-trunk-b.patch,
and for branch-2: YARN-425-branch-2-b.patch



 coverage fix for yarn api
 -

 Key: YARN-425
 URL: https://issues.apache.org/jira/browse/YARN-425
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 3.0.0, 0.23.7, 2.0.5-beta
Reporter: Aleksey Gorshkov
Assignee: Aleksey Gorshkov
 Attachments: YARN-425-branch-0.23.patch, YARN-425-branch-2-b.patch, 
 YARN-425-branch-2.patch, YARN-425-trunk-a.patch, YARN-425-trunk-b.patch, 
 YARN-425-trunk.patch


 coverage fix for yarn api
 patch YARN-425-trunk-a.patch for trunk
 patch YARN-425-branch-2.patch for branch-2
 patch YARN-425-branch-0.23.patch for branch-0.23

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-427) Coverage fix for org.apache.hadoop.yarn.server.api.*

2013-04-03 Thread Aleksey Gorshkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Gorshkov updated YARN-427:
--

Attachment: YARN-427-trunk-a.patch
YARN-427-branch-2-a.patch

 Coverage fix for org.apache.hadoop.yarn.server.api.*
 

 Key: YARN-427
 URL: https://issues.apache.org/jira/browse/YARN-427
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 3.0.0, 0.23.7, 2.0.5-beta
Reporter: Aleksey Gorshkov
Assignee: Aleksey Gorshkov
 Attachments: YARN-427-branch-2-a.patch, YARN-427-branch-2.patch, 
 YARN-427-trunk-a.patch, YARN-427-trunk.patch


 Coverage fix for org.apache.hadoop.yarn.server.api.*
 patch YARN-427-trunk.patch for trunk
 patch YARN-427-branch-2.patch for branch-2 and branch-0.23

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-427) Coverage fix for org.apache.hadoop.yarn.server.api.*

2013-04-03 Thread Aleksey Gorshkov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620845#comment-13620845
 ] 

Aleksey Gorshkov commented on YARN-427:
---

Patches were updated:
patch YARN-427-trunk-a.patch for trunk
patch YARN-427-branch-2-a.patch for branch-2 and branch-0.23

 Coverage fix for org.apache.hadoop.yarn.server.api.*
 

 Key: YARN-427
 URL: https://issues.apache.org/jira/browse/YARN-427
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 3.0.0, 0.23.7, 2.0.5-beta
Reporter: Aleksey Gorshkov
Assignee: Aleksey Gorshkov
 Attachments: YARN-427-branch-2-a.patch, YARN-427-branch-2.patch, 
 YARN-427-trunk-a.patch, YARN-427-trunk.patch


 Coverage fix for org.apache.hadoop.yarn.server.api.*
 patch YARN-427-trunk.patch for trunk
 patch YARN-427-branch-2.patch for branch-2 and branch-0.23

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-465) fix coverage org.apache.hadoop.yarn.server.webproxy

2013-04-03 Thread Aleksey Gorshkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Gorshkov updated YARN-465:
--

Attachment: YARN-465-branch-2-a.patch
YARN-465-branch-0.23-a.patch

 fix coverage  org.apache.hadoop.yarn.server.webproxy
 

 Key: YARN-465
 URL: https://issues.apache.org/jira/browse/YARN-465
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
Assignee: Aleksey Gorshkov
 Attachments: YARN-465-branch-0.23-a.patch, 
 YARN-465-branch-0.23.patch, YARN-465-branch-2-a.patch, 
 YARN-465-branch-2.patch, YARN-465-trunk-a.patch, YARN-465-trunk.patch


 fix coverage  org.apache.hadoop.yarn.server.webproxy
 patch YARN-465-trunk.patch for trunk
 patch YARN-465-branch-2.patch for branch-2
 patch YARN-465-branch-0.23.patch for branch-0.23
 There is an issue in branch-0.23: the patch does not create the .keep file.
 To fix it, run these commands:
 mkdir 
 yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy
 touch 
 yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy/.keep
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-465) fix coverage org.apache.hadoop.yarn.server.webproxy

2013-04-03 Thread Aleksey Gorshkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Gorshkov updated YARN-465:
--

Attachment: YARN-465-trunk-a.patch

 fix coverage  org.apache.hadoop.yarn.server.webproxy
 

 Key: YARN-465
 URL: https://issues.apache.org/jira/browse/YARN-465
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
Assignee: Aleksey Gorshkov
 Attachments: YARN-465-branch-0.23-a.patch, 
 YARN-465-branch-0.23.patch, YARN-465-branch-2-a.patch, 
 YARN-465-branch-2.patch, YARN-465-trunk-a.patch, YARN-465-trunk.patch


 fix coverage  org.apache.hadoop.yarn.server.webproxy
 patch YARN-465-trunk.patch for trunk
 patch YARN-465-branch-2.patch for branch-2
 patch YARN-465-branch-0.23.patch for branch-0.23
 There is an issue in branch-0.23: the patch does not create the .keep file.
 To fix it, run these commands:
 mkdir 
 yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy
 touch 
 yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy/.keep
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-465) fix coverage org.apache.hadoop.yarn.server.webproxy

2013-04-03 Thread Aleksey Gorshkov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620914#comment-13620914
 ] 

Aleksey Gorshkov commented on YARN-465:
---

patches updated

patch YARN-465-trunk-a.patch for trunk
patch YARN-465-branch-2-a.patch for branch-2
patch YARN-465-branch-0.23-a.patch for branch-0.23

 fix coverage  org.apache.hadoop.yarn.server.webproxy
 

 Key: YARN-465
 URL: https://issues.apache.org/jira/browse/YARN-465
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
Assignee: Aleksey Gorshkov
 Attachments: YARN-465-branch-0.23-a.patch, 
 YARN-465-branch-0.23.patch, YARN-465-branch-2-a.patch, 
 YARN-465-branch-2.patch, YARN-465-trunk-a.patch, YARN-465-trunk.patch


 fix coverage  org.apache.hadoop.yarn.server.webproxy
 patch YARN-465-trunk.patch for trunk
 patch YARN-465-branch-2.patch for branch-2
 patch YARN-465-branch-0.23.patch for branch-0.23
 There is an issue in branch-0.23: the patch does not create the .keep file.
 To fix it, run these commands:
 mkdir 
 yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy
 touch 
 yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy/.keep
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-117) Enhance YARN service model

2013-04-03 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620913#comment-13620913
 ] 

Steve Loughran commented on YARN-117:
-

I'm not seeing all those tests failing locally, only 
{{TestUnmanagedAMLauncher}} and {{TestNMExpiry}}.

{code}
testNMExpiry(org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry)
  Time elapsed: 2797 sec   FAILURE!
junit.framework.AssertionFailedError: expected:<2> but was:<0>
at junit.framework.Assert.fail(Assert.java:47)
at junit.framework.Assert.failNotEquals(Assert.java:283)
at junit.framework.Assert.assertEquals(Assert.java:64)
at junit.framework.Assert.assertEquals(Assert.java:195)
at junit.framework.Assert.assertEquals(Assert.java:201)
at 
org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry.testNMExpiry(TestNMExpiry.java:157)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
{code}

and 
{code}
Running 
org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.579 sec  
FAILURE!
org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher 
 Time elapsed: 579 sec   ERROR!
org.apache.hadoop.yarn.YarnException: could not cleanup test dir: 
java.lang.RuntimeException: Error parsing 'yarn-site.xml' : 
org.xml.sax.SAXParseException: Premature end of file.
at 
org.apache.hadoop.yarn.server.MiniYARNCluster.init(MiniYARNCluster.java:95)
at 
org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher.setup(TestUnmanagedAMLauncher.java:52)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:27)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)



 Enhance YARN service model
 --

 Key: YARN-117
 URL: https://issues.apache.org/jira/browse/YARN-117
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Steve Loughran
Assignee: Steve Loughran
 Attachments: YARN-117.patch


 Having played with the YARN service model, there are some issues
 that I've identified based on past work and initial use.
 This JIRA issue is an overall one to cover the issues, with solutions pushed 
 out to separate JIRAs.
 h2. state model prevents stopped state being entered if you could not 
 successfully start the service.
 In the current lifecycle you cannot stop a service unless it was successfully 
 started, but
 * {{init()}} may acquire resources that need to be explicitly released
 * if the {{start()}} operation fails partway through, the {{stop()}} 
 operation may be needed to release resources.
 *Fix:* make {{stop()}} a valid state transition from all states and require 
 the implementations to be able to stop safely without requiring all fields 

[jira] [Created] (YARN-535) TestUnmanagedAMLauncher can corrupt target/test-classes/yarn-site.xml during write phase, breaks later test runs

2013-04-03 Thread Steve Loughran (JIRA)
Steve Loughran created YARN-535:
---

 Summary: TestUnmanagedAMLauncher can corrupt 
target/test-classes/yarn-site.xml during write phase, breaks later test runs
 Key: YARN-535
 URL: https://issues.apache.org/jira/browse/YARN-535
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications
Affects Versions: 3.0.0
 Environment: OS/X laptop, HFS+ filesystem
Reporter: Steve Loughran
Priority: Minor


The setup phase of {{TestUnmanagedAMLauncher}} overwrites {{yarn-site.xml}}. As 
{{Configuration.writeXml()}} re-reads all resources, this will break if the 
(open-for-writing) resource is already visible as an empty file. 

This leaves a corrupted {{target/test-classes/yarn-site.xml}}, which breaks 
later test runs, because it is not overwritten by later incremental builds due 
to timestamps.
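
One possible way to sidestep the race described above would be to render the 
configuration fully in memory before touching the file, so the re-read never sees a 
truncated yarn-site.xml. This is only a sketch of the idea, not the eventual fix:
{code}
import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;

// Sketch: serialize first, then write the bytes in one go.
public class SafeConfWrite {
  public static void writeConf(Configuration conf, String path) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    // writeXml() may re-read resources, but the on-disk yarn-site.xml has not
    // been truncated yet, so the re-read still sees the old, complete file.
    conf.writeXml(buffer);
    OutputStream out = new FileOutputStream(path);
    try {
      buffer.writeTo(out);   // single write of the complete document
    } finally {
      out.close();
    }
  }
}
{code}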



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-535) TestUnmanagedAMLauncher can corrupt target/test-classes/yarn-site.xml during write phase, breaks later test runs

2013-04-03 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620974#comment-13620974
 ] 

Steve Loughran commented on YARN-535:
-

{code}
org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher 
 Time elapsed: 4137 sec   ERROR!
java.lang.RuntimeException: Error parsing 'yarn-site.xml' : 
org.xml.sax.SAXParseException: Premature end of file.
at 
org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2050)
at 
org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1899)
at 
org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1816)
at 
org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:465)
at 
org.apache.hadoop.conf.Configuration.asXmlDocument(Configuration.java:2127)
at 
org.apache.hadoop.conf.Configuration.writeXml(Configuration.java:2096)
at 
org.apache.hadoop.conf.Configuration.writeXml(Configuration.java:2086)
at 
org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher.setup(TestUnmanagedAMLauncher.java:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:27)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
at 
org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
at 
org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
Caused by: org.xml.sax.SAXParseException: Premature end of file.
at 
com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:246)
at 
com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:153)
at org.apache.hadoop.conf.Configuration.parse(Configuration.java:1887)
at org.apache.hadoop.conf.Configuration.parse(Configuration.java:1875)
at 
org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1946)
... 29 more
{code}
This stack trace is a failure to read the file yarn-site.xml, which is actually 
being written on line 63 of TestUnmanagedAMLauncher - a file that 
is already open for writing. 

It is possible that some filesystems (here, HFS+) make that write visible while 
it is still in progress, triggering a failure which then corrupts later builds 
at init time:
{code}
$ ls -l target/test-classes/yarn-site.xml 
-rw-r--r--  1 stevel  staff  0  3 Apr 15:37 target/test-classes/yarn-site.xml
{code}

This is newer than the one in src/test/resources, so Maven doesn't replace it on 
the next test run:
{code}
$ ls -l src/test/resources/yarn-site.xml 
-rw-r--r--@ 1 stevel  staff  830 28 Nov 16:29 src/test/resources/yarn-site.xml
{code}
As a result, follow-on tests fail when MiniYARNCluster tries to read it.

{code}
org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher 
 Time elapsed: 515 sec   ERROR!
java.lang.RuntimeException: Error parsing 'yarn-site.xml' : 
org.xml.sax.SAXParseException: Premature end of file.
at 
org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2050)
at 
org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1899)
at 

[jira] [Commented] (YARN-528) Make IDs read only

2013-04-03 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621023#comment-13621023
 ] 

Robert Joseph Evans commented on YARN-528:
--

OK, I understand now.

I will try to find some time to play around with getting the AM ID to not have 
a wrapper at all.

 Make IDs read only
 --

 Key: YARN-528
 URL: https://issues.apache.org/jira/browse/YARN-528
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: YARN-528.txt, YARN-528.txt


 I really would like to rip out most, if not all, of the abstraction layer that 
 sits in between Protocol Buffers, the RPC, and the actual user code.  We have 
 no plans to support any other serialization type, and the abstraction layer 
 just makes it more difficult to change protocols, makes changing them more 
 error-prone, and slows down the objects themselves.  
 Completely doing that is a lot of work.  This JIRA is a first step towards 
 that.  It makes the various ID objects immutable.  If this patch is well 
 received I will try to go through other objects/classes of objects and update 
 them in a similar way.
 This is probably the last time we will be able to make a change like this 
 before 2.0 stabilizes and the YARN APIs can no longer be changed.
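 To make the intent concrete, here is a minimal sketch of what a read-only ID looks 
 like (the two fields mirror the real ApplicationId, but the class itself is 
 hypothetical): all fields are final, there are no setters, and equality is by value.
 {code}
// Illustrative immutable ID; not the patch itself.
public final class ImmutableAppIdSketch {
  private final long clusterTimestamp;
  private final int id;

  public ImmutableAppIdSketch(long clusterTimestamp, int id) {
    this.clusterTimestamp = clusterTimestamp;
    this.id = id;
  }

  public long getClusterTimestamp() { return clusterTimestamp; }
  public int getId() { return id; }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof ImmutableAppIdSketch)) {
      return false;
    }
    ImmutableAppIdSketch other = (ImmutableAppIdSketch) o;
    return clusterTimestamp == other.clusterTimestamp && id == other.id;
  }

  @Override
  public int hashCode() {
    return 31 * (int) (clusterTimestamp ^ (clusterTimestamp >>> 32)) + id;
  }

  @Override
  public String toString() {
    return "application_" + clusterTimestamp + "_" + id;
  }
}
 {code}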

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-412) FifoScheduler incorrectly checking for node locality

2013-04-03 Thread Roger Hoover (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Hoover updated YARN-412:
--

Attachment: (was: YARN-412.patch)

 FifoScheduler incorrectly checking for node locality
 

 Key: YARN-412
 URL: https://issues.apache.org/jira/browse/YARN-412
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Roger Hoover
Assignee: Roger Hoover
Priority: Minor
  Labels: patch
 Attachments: YARN-412.patch


 In the FifoScheduler, the assignNodeLocalContainers method checks whether the 
 data is local to a node by searching for the nodeAddress of the node in the 
 set of outstanding requests for the app.  This seems to be incorrect, as it 
 should be checking the hostname instead.  The offending line of code is 455:
 application.getResourceRequest(priority, node.getRMNode().getNodeAddress());
 Requests are keyed by hostname (e.g. host1.foo.com), whereas node addresses 
 are a concatenation of hostname and command port (e.g. host1.foo.com:1234).
 In the CapacityScheduler, it's done using hostname.  See 
 LeafQueue.assignNodeLocalContainers, line 1129:
 application.getResourceRequest(priority, node.getHostName());
 Note that this bug does not affect the actual scheduling decisions made by 
 the FifoScheduler, because even though it incorrectly determines that a request 
 is not local to the node, it will still schedule the request immediately 
 because it's rack-local.  However, this bug may be adversely affecting the 
 reporting of job status by underreporting the number of tasks that were 
 node-local.
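 A toy illustration of the mismatch (not scheduler code): outstanding requests are 
 keyed by hostname, so looking them up by nodeAddress always misses.
 {code}
import java.util.HashMap;
import java.util.Map;

public class LocalityKeyMismatch {
  public static void main(String[] args) {
    Map<String, Integer> outstandingRequests = new HashMap<String, Integer>();
    outstandingRequests.put("host1.foo.com", 3);   // requests keyed by hostname

    // lookup by nodeAddress (hostname:port) misses -> not counted as node-local
    System.out.println(outstandingRequests.get("host1.foo.com:1234"));  // null
    // lookup by hostname matches
    System.out.println(outstandingRequests.get("host1.foo.com"));       // 3
  }
}
 {code}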

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-527) Local filecache mkdir fails

2013-04-03 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621042#comment-13621042
 ] 

Vinod Kumar Vavilapalli commented on YARN-527:
--

If it is the 32K limit that caused it, the timing couldn't be more perfect. I just 
committed YARN-467, which addresses it for the public cache, and YARN-99, which 
takes care of the private cache, is in progress. These two JIRAs enforce a limit 
in YARN itself; the default is 8192.

Looking back again at your stack trace, I agree that it is very likely you are 
hitting the 32K limit.

Can I close this as a duplicate of YARN-467? You can verify the fix on 
2.0.5-beta when it is out.

 Local filecache mkdir fails
 ---

 Key: YARN-527
 URL: https://issues.apache.org/jira/browse/YARN-527
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.0-alpha
 Environment: RHEL 6.3 with CDH4.1.3 Hadoop, HA with two name nodes 
 and six worker nodes.
Reporter: Knut O. Hellan
Priority: Minor
 Attachments: yarn-site.xml


 Jobs failed with no other explanation than this stack trace:
 2013-03-29 16:46:02,671 INFO [AsyncDispatcher event handler] 
 org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diag
 nostics report from attempt_1364591875320_0017_m_00_0: 
 java.io.IOException: mkdir of /disk3/yarn/local/filecache/-42307893
 55400878397 failed
 at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:932)
 at 
 org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
 at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
 at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
 at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
 at 
 org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2333)
 at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Manually creating the directory worked. This behavior was common to at least 
 several nodes in the cluster.
 The situation was resolved by removing and recreating all 
 /disk?/yarn/local/filecache directories on all nodes.
 It is unclear whether Yarn struggled with the number of files or if there 
 were corrupt files in the caches. The situation was triggered by a node dying.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-430) Add HDFS based store for RM which manages the store using directories

2013-04-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-430:
-

Summary: Add HDFS based store for RM which manages the store using 
directories  (was: Add HDFS based store for RM)

 Add HDFS based store for RM which manages the store using directories
 -

 Key: YARN-430
 URL: https://issues.apache.org/jira/browse/YARN-430
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Jian He

 There is a generic FileSystem store, but it does not take advantage of HDFS 
 features like directories, replication, DFSClient advanced settings for HA, 
 retries, etc. Writing a store that's optimized for HDFS would be good.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-248) Security related work for RM restart

2013-04-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-248:
-

Summary: Security related work for RM restart  (was: Restore 
RMDelegationTokenSecretManager state on restart)

 Security related work for RM restart
 

 Key: YARN-248
 URL: https://issues.apache.org/jira/browse/YARN-248
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Tom White
Assignee: Bikas Saha

 On restart, the RM creates a new RMDelegationTokenSecretManager with fresh 
 state. This will cause problems for Oozie jobs running on secure clusters 
 since the delegation tokens stored in the job credentials (used by the Oozie 
 launcher job to submit a job to the RM) will not be recognized by the RM, and 
 recovery will fail.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-458) YARN daemon addresses must be placed in many different configs

2013-04-03 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621114#comment-13621114
 ] 

Sandy Ryza commented on YARN-458:
-

Verified on a pseudo-distributed cluster that both the old and new configs work.

 YARN daemon addresses must be placed in many different configs
 --

 Key: YARN-458
 URL: https://issues.apache.org/jira/browse/YARN-458
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-458.patch


 The YARN resourcemanager's address is included in four different configs: 
 yarn.resourcemanager.scheduler.address, 
 yarn.resourcemanager.resource-tracker.address, yarn.resourcemanager.address, 
 and yarn.resourcemanager.admin.address
 A new user trying to configure a cluster needs to know the names of all four 
 of these configs.
 The same issue exists for nodemanagers.
 It would be much easier if they could simply specify 
 yarn.resourcemanager.hostname and yarn.nodemanager.hostname and have the 
 default ports for the other ones kick in.
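 A rough sketch of the proposed behaviour (illustration only, not the attached patch; 
 the 8030 scheduler port is an assumption used here for concreteness): an explicit 
 address still wins, otherwise the hostname plus a default port is used.
 {code}
import org.apache.hadoop.conf.Configuration;

// Sketch of deriving the scheduler address from yarn.resourcemanager.hostname.
public class RmAddressFallbackSketch {
  public static String schedulerAddress(Configuration conf) {
    String explicit = conf.get("yarn.resourcemanager.scheduler.address");
    if (explicit != null) {
      return explicit;                      // old-style config still takes effect
    }
    String host = conf.get("yarn.resourcemanager.hostname", "0.0.0.0");
    return host + ":8030";                  // assumed default scheduler port
  }
}
 {code}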

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-193) Scheduler.normalizeRequest does not account for allocation requests that exceed maximumAllocation limits

2013-04-03 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-193:
-

Attachment: YARN-193.13.patch

Fix the bug of setting it twice, and change the default max vcores to 4.

 Scheduler.normalizeRequest does not account for allocation requests that 
 exceed maximumAllocation limits 
 -

 Key: YARN-193
 URL: https://issues.apache.org/jira/browse/YARN-193
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.2-alpha, 3.0.0
Reporter: Hitesh Shah
Assignee: Zhijie Shen
 Attachments: MR-3796.1.patch, MR-3796.2.patch, MR-3796.3.patch, 
 MR-3796.wip.patch, YARN-193.10.patch, YARN-193.11.patch, YARN-193.12.patch, 
 YARN-193.13.patch, YARN-193.4.patch, YARN-193.5.patch, YARN-193.6.patch, 
 YARN-193.7.patch, YARN-193.8.patch, YARN-193.9.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-99) Jobs fail during resource localization when private distributed-cache hits unix directory limits

2013-04-03 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621128#comment-13621128
 ] 

Omkar Vinit Joshi commented on YARN-99:
---

Rebasing the patch as YARN-467 is now committed.
This issue is related to YARN-467, and the detailed information can be found here: 
[underlying problem and proposed/implemented solution | 
https://issues.apache.org/jira/browse/YARN-467?focusedCommentId=13615894&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13615894]

The only difference here is that the same problem is present in 
local-dir/usercache/user-name/filecache (the private user cache). We are using 
LocalCacheDirectoryManager for the user cache but not for the app cache, as it is 
highly unlikely for an application to have that many localized files.

The earlier implementation for the private cache computed the localized path 
inside ContainerLocalizer, i.e. in a different process. In order to centralize 
this, we have moved it to ResourceLocalizationService.LocalizerRunner, and the 
path is communicated to each ContainerLocalizer as part of the heartbeat. 
Thereby we can now manage the LocalCacheDirectory in one place.
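
For readers not following YARN-467, the idea behind LocalCacheDirectoryManager can be 
sketched roughly like this (illustration only; the class and method names here are 
made up):
{code}
// Rough sketch: never let one cache directory accumulate more than a cap.
public class CacheDirSketch {
  private final int perDirectoryLimit;   // e.g. 8192, well under ext3's ~32000
  private int filesInCurrentDir = 0;
  private int currentDir = 0;

  public CacheDirSketch(int perDirectoryLimit) {
    this.perDirectoryLimit = perDirectoryLimit;
  }

  /** Relative sub-directory the next localized resource should be placed in. */
  public synchronized String nextSubDirectory() {
    if (filesInCurrentDir >= perDirectoryLimit) {
      currentDir++;                      // spill into a fresh sub-directory
      filesInCurrentDir = 0;
    }
    filesInCurrentDir++;
    return Integer.toString(currentDir, 36);   // compact 0-9a-z style names
  }
}
{code}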

 Jobs fail during resource localization when private distributed-cache hits 
 unix directory limits
 

 Key: YARN-99
 URL: https://issues.apache.org/jira/browse/YARN-99
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.0.0-alpha
Reporter: Devaraj K
Assignee: Omkar Vinit Joshi
 Attachments: yarn-99-20130324.patch


 If we have multiple jobs which use the distributed cache with small 
 files, the directory limit is reached before the cache size limit, and it fails 
 to create any more directories in the file cache. The jobs start failing with 
 the exception below.
 {code:xml}
 java.io.IOException: mkdir of 
 /tmp/nm-local-dir/usercache/root/filecache/1701886847734194975 failed
   at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:909)
   at 
 org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
   at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
   at 
 org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
   at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {code}
 We should have a mechanism to clean the cache files when the number of 
 directories crosses a specified limit, just as we do for cache size.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-514) Delayed store operations should not result in RM unavailability for app submission

2013-04-03 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen reassigned YARN-514:


Assignee: Zhijie Shen  (was: Bikas Saha)

 Delayed store operations should not result in RM unavailability for app 
 submission
 --

 Key: YARN-514
 URL: https://issues.apache.org/jira/browse/YARN-514
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Zhijie Shen

 Currently, app submission is the only store operation performed synchronously 
 because the app must be stored before the request returns with success. This 
 makes the RM susceptible to blocking all client threads on slow store 
 operations, resulting in RM being perceived as unavailable by clients.
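 The direction this sub-task points at could look roughly like the following (names are 
 illustrative, not RM code): accept the submission immediately and perform the durable 
 write off-thread, signalling completion through a callback or event instead of 
 blocking the client.
 {code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of an asynchronous app-submission store; not the actual RMStateStore.
public class AsyncAppStoreSketch {
  interface StoreCallback { void onStored(String appId); }

  private final ExecutorService storeExecutor = Executors.newSingleThreadExecutor();

  public void submitApplication(final String appId, final byte[] appState,
                                final StoreCallback callback) {
    // return to the client right away; the slow store I/O happens off-thread
    storeExecutor.submit(new Runnable() {
      public void run() {
        writeToStateStore(appId, appState);   // potentially slow
        callback.onStored(appId);             // drive the app state machine forward
      }
    });
  }

  private void writeToStateStore(String appId, byte[] appState) {
    // placeholder for the real durable write
  }
}
 {code}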

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-536) Remove ContainerStatus, ContainerState from Container api interface as they will not be called by the container object

2013-04-03 Thread Xuan Gong (JIRA)
Xuan Gong created YARN-536:
--

 Summary: Remove ContainerStatus, ContainerState from Container api 
interface as they will not be called by the container object
 Key: YARN-536
 URL: https://issues.apache.org/jira/browse/YARN-536
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong


Remove ContainerState and ContainerStatus from the Container interface. They will 
not be called on the Container object.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-536) Remove ContainerStatus, ContainerState from Container api interface as they will not be called by the container object

2013-04-03 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong reassigned YARN-536:
--

Assignee: Xuan Gong

 Remove ContainerStatus, ContainerState from Container api interface as they 
 will not be called by the container object
 --

 Key: YARN-536
 URL: https://issues.apache.org/jira/browse/YARN-536
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong

 Remove ContainerState and ContainerStatus from the Container interface. They 
 will not be called on the Container object.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-404) Node Manager leaks Data Node connections

2013-04-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-404:
-

Priority: Major  (was: Blocker)

Moving it off blocker status.

Devaraj, can you give us more information? Is this still happening? Tx.

 Node Manager leaks Data Node connections
 

 Key: YARN-404
 URL: https://issues.apache.org/jira/browse/YARN-404
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Affects Versions: 2.0.2-alpha, 0.23.6
Reporter: Devaraj K
Assignee: Devaraj K

 The RM fails to give some applications to the NM for cleanup. Because of this, 
 log aggregation does not happen for those applications, and it also leaks 
 DataNode connections on the NM side.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-536) Remove ContainerStatus, ContainerState from Container api interface as they will not be called by the container object

2013-04-03 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621180#comment-13621180
 ] 

Xuan Gong commented on YARN-536:


Remove the getters and setters for ContainerState and ContainerStatus from the 
Container interface, and remove those fields from the proto file. There is some 
test code which uses the getters and setters to read the ContainerState or 
ContainerStatus from the Container object: 
/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java
/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/NodeManager.java
/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java

 Remove ContainerStatus, ContainerState from Container api interface as they 
 will not be called by the container object
 --

 Key: YARN-536
 URL: https://issues.apache.org/jira/browse/YARN-536
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong

 Remove ContainerState and ContainerStatus from the Container interface. They 
 will not be called on the Container object.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-101) If the heartbeat message loss, the nodestatus info of complete container will loss too.

2013-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621201#comment-13621201
 ] 

Hudson commented on YARN-101:
-

Integrated in Hadoop-trunk-Commit #3554 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3554/])
YARN-101. Fix NodeManager heartbeat processing to not lose track of 
completed containers in case of dropped heartbeats. Contributed by Xuan Gong. 
(Revision 1464105)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1464105
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java
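
The idea behind the fix, sketched very loosely (illustrative class, not the committed 
code): completed-container statuses stay in a pending list and are only dropped once a 
heartbeat that reported them actually succeeds.
{code}
import java.util.ArrayList;
import java.util.List;

public class PendingCompletionsSketch {
  private final List<String> pendingCompletedContainers = new ArrayList<String>();

  public synchronized void containerCompleted(String containerId) {
    pendingCompletedContainers.add(containerId);
  }

  /** Statuses to send in the next heartbeat; nothing is removed yet. */
  public synchronized List<String> snapshotForHeartbeat() {
    return new ArrayList<String>(pendingCompletedContainers);
  }

  /** Called only after the RM acknowledged the heartbeat that carried them. */
  public synchronized void heartbeatSucceeded(List<String> reported) {
    pendingCompletedContainers.removeAll(reported);
  }
}
{code}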


 If  the heartbeat message loss, the nodestatus info of complete container 
 will loss too.
 

 Key: YARN-101
 URL: https://issues.apache.org/jira/browse/YARN-101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
 Environment: suse.
Reporter: xieguiming
Assignee: Xuan Gong
Priority: Minor
 Fix For: 2.0.5-beta

 Attachments: YARN-101.1.patch, YARN-101.2.patch, YARN-101.3.patch, 
 YARN-101.4.patch, YARN-101.5.patch, YARN-101.6.patch


 see the red color:
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.java
  protected void startStatusUpdater() {
    new Thread("Node Status Updater") {
      @Override
      @SuppressWarnings("unchecked")
      public void run() {
        int lastHeartBeatID = 0;
        while (!isStopped) {
          // Send heartbeat
          try {
            synchronized (heartbeatMonitor) {
              heartbeatMonitor.wait(heartBeatInterval);
            }
 {color:red}
            // Before we send the heartbeat, we get the NodeStatus,
            // whose method removes completed containers.
            NodeStatus nodeStatus = getNodeStatus();
 {color}
            nodeStatus.setResponseId(lastHeartBeatID);

            NodeHeartbeatRequest request = recordFactory
                .newRecordInstance(NodeHeartbeatRequest.class);
            request.setNodeStatus(nodeStatus);
 {color:red}
            // But if the nodeHeartbeat fails, we have already removed the
            // completed containers, so we lose track of them. We aren't
            // handling the nodeHeartbeat failure case here.
            HeartbeatResponse response =
                resourceTracker.nodeHeartbeat(request).getHeartbeatResponse();
 {color}
            if (response.getNodeAction() == NodeAction.SHUTDOWN) {
              LOG.info("Recieved SHUTDOWN signal from Resourcemanager as " +
                  "part of heartbeat, hence shutting down.");
              NodeStatusUpdaterImpl.this.stop();
              break;
            }
            if (response.getNodeAction() == NodeAction.REBOOT) {
              LOG.info("Node is out of sync with ResourceManager,"
                  + " hence rebooting.");
              NodeStatusUpdaterImpl.this.reboot();
              break;
            }
            lastHeartBeatID = response.getResponseId();
            List<ContainerId> containersToCleanup = response
                .getContainersToCleanupList();
            if (containersToCleanup.size() != 0) {
              dispatcher.getEventHandler().handle(
                  new CMgrCompletedContainersEvent(containersToCleanup));
            }
            List<ApplicationId> appsToCleanup =
                response.getApplicationsToCleanupList();
            // Only start tracking for keepAlive on FINISH_APP
            trackAppsForKeepAlive(appsToCleanup);
            if (appsToCleanup.size() != 0) {
              dispatcher.getEventHandler().handle(
                  new CMgrCompletedAppsEvent(appsToCleanup));
            }
          } catch (Throwable e) {
            // TODO Better error handling. Thread can die with the rest of the
            // NM still running.
            LOG.error("Caught exception in status-updater", e);
          }
        }
      }
    }.start();
  }

  private NodeStatus getNodeStatus() {
    NodeStatus nodeStatus = recordFactory.newRecordInstance(NodeStatus.class);
    nodeStatus.setNodeId(this.nodeId);
    int numActiveContainers = 0;

[jira] [Commented] (YARN-381) Improve FS docs

2013-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621203#comment-13621203
 ] 

Hudson commented on YARN-381:
-

Integrated in Hadoop-trunk-Commit #3554 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3554/])
YARN-381. Improve fair scheduler docs. Contributed by Sandy Ryza. (Revision 
1464130)

 Result = SUCCESS
tomwhite : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1464130
Files : 
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/ClusterSetup.apt.vm
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm


 Improve FS docs
 ---

 Key: YARN-381
 URL: https://issues.apache.org/jira/browse/YARN-381
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins
Assignee: Sandy Ryza
Priority: Minor
 Fix For: 2.0.5-beta

 Attachments: YARN-381.patch


 The MR2 FS docs could use some improvements.
 Configuration:
 - sizebasedweight - what is the size here? Total memory usage?
 Pool properties:
 - minResources - what does min amount of aggregate memory mean given that 
 this is not a reservation?
 - maxResources - is this a hard limit?
 - weight - how is this ratio configured? E.g. is the base 1 and all weights 
 relative to that?
 - schedulingMode - what is the default? Is fifo pure FIFO, e.g. does it wait 
 until all tasks for the job are finished before launching the next job?
 There's no mention of ACLs, even though they're supported. See the CS docs 
 for comparison.
 Also there are a couple of typos worth fixing while we're at it, e.g. "finish. 
 apps to run"
 Worth keeping in mind that some of these will need to be updated to reflect 
 that resource calculators are now pluggable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-427) Coverage fix for org.apache.hadoop.yarn.server.api.*

2013-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621204#comment-13621204
 ] 

Hadoop QA commented on YARN-427:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12576767/YARN-427-trunk-a.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:red}-1 javac{color:red}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/660//console

This message is automatically generated.

 Coverage fix for org.apache.hadoop.yarn.server.api.*
 

 Key: YARN-427
 URL: https://issues.apache.org/jira/browse/YARN-427
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 3.0.0, 0.23.7, 2.0.5-beta
Reporter: Aleksey Gorshkov
Assignee: Aleksey Gorshkov
 Attachments: YARN-427-branch-2-a.patch, YARN-427-branch-2.patch, 
 YARN-427-trunk-a.patch, YARN-427-trunk.patch


 Coverage fix for org.apache.hadoop.yarn.server.api.*
 patch YARN-427-trunk.patch for trunk
 patch YARN-427-branch-2.patch for branch-2 and branch-0.23

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-465) fix coverage org.apache.hadoop.yarn.server.webproxy

2013-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621215#comment-13621215
 ] 

Hadoop QA commented on YARN-465:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12576782/YARN-465-trunk-a.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/659//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/659//console

This message is automatically generated.

 fix coverage  org.apache.hadoop.yarn.server.webproxy
 

 Key: YARN-465
 URL: https://issues.apache.org/jira/browse/YARN-465
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
Assignee: Aleksey Gorshkov
 Attachments: YARN-465-branch-0.23-a.patch, 
 YARN-465-branch-0.23.patch, YARN-465-branch-2-a.patch, 
 YARN-465-branch-2.patch, YARN-465-trunk-a.patch, YARN-465-trunk.patch


 fix coverage  org.apache.hadoop.yarn.server.webproxy
 patch YARN-465-trunk.patch for trunk
 patch YARN-465-branch-2.patch for branch-2
 patch YARN-465-branch-0.23.patch for branch-0.23
 There is an issue in branch-0.23: the patch does not create the .keep file. 
 To fix it, run the following commands:
 mkdir 
 yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy
 touch 
 yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy/.keep
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-425) coverage fix for yarn api

2013-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621218#comment-13621218
 ] 

Hadoop QA commented on YARN-425:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12576764/YARN-425-trunk-b.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/661//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/661//console

This message is automatically generated.

 coverage fix for yarn api
 -

 Key: YARN-425
 URL: https://issues.apache.org/jira/browse/YARN-425
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 3.0.0, 0.23.7, 2.0.5-beta
Reporter: Aleksey Gorshkov
Assignee: Aleksey Gorshkov
 Attachments: YARN-425-branch-0.23.patch, YARN-425-branch-2-b.patch, 
 YARN-425-branch-2.patch, YARN-425-trunk-a.patch, YARN-425-trunk-b.patch, 
 YARN-425-trunk.patch


 coverage fix for yarn api
 patch YARN-425-trunk-a.patch for trunk
 patch YARN-425-branch-2.patch for branch-2
 patch YARN-425-branch-0.23.patch for branch-0.23

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-193) Scheduler.normalizeRequest does not account for allocation requests that exceed maximumAllocation limits

2013-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621238#comment-13621238
 ] 

Hadoop QA commented on YARN-193:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12576820/YARN-193.13.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/662//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/662//console

This message is automatically generated.

 Scheduler.normalizeRequest does not account for allocation requests that 
 exceed maximumAllocation limits 
 -

 Key: YARN-193
 URL: https://issues.apache.org/jira/browse/YARN-193
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.2-alpha, 3.0.0
Reporter: Hitesh Shah
Assignee: Zhijie Shen
 Attachments: MR-3796.1.patch, MR-3796.2.patch, MR-3796.3.patch, 
 MR-3796.wip.patch, YARN-193.10.patch, YARN-193.11.patch, YARN-193.12.patch, 
 YARN-193.13.patch, YARN-193.4.patch, YARN-193.5.patch, YARN-193.6.patch, 
 YARN-193.7.patch, YARN-193.8.patch, YARN-193.9.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-99) Jobs fail during resource localization when private distributed-cache hits unix directory limits

2013-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621243#comment-13621243
 ] 

Hadoop QA commented on YARN-99:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12576823/yarn-99-20130403.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/663//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/663//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-api.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/663//console

This message is automatically generated.

 Jobs fail during resource localization when private distributed-cache hits 
 unix directory limits
 

 Key: YARN-99
 URL: https://issues.apache.org/jira/browse/YARN-99
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.0.0-alpha
Reporter: Devaraj K
Assignee: Omkar Vinit Joshi
 Attachments: yarn-99-20130324.patch, yarn-99-20130403.patch


 If we have multiple jobs that use the distributed cache with many small 
 files, the per-directory limit is reached before the cache size limit, and no 
 more directories can be created in the file cache. The jobs start failing 
 with the exception below.
 {code:xml}
 java.io.IOException: mkdir of 
 /tmp/nm-local-dir/usercache/root/filecache/1701886847734194975 failed
   at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:909)
   at 
 org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
   at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
   at 
 org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
   at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {code}
 We should have a mechanism to clean the cache files once they cross a 
 specified number of directories, analogous to the existing cache size limit.
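 For illustration, the kind of check such a mechanism would add might look like the 
 sketch below; the threshold and its wiring are hypothetical and are not taken from 
 the attached patch.
{code}
// Hypothetical sketch: trigger a filecache cleanup once the number of entries
// in the directory crosses a configured limit, analogous to the existing
// size-based trigger. Not taken from the attached patch.
import java.io.File;

public final class FileCountCleanupCheck {

  private FileCountCleanupCheck() {
  }

  /**
   * Returns true when the local filecache directory holds at least maxEntries
   * entries, i.e. when cleanup should run to stay well below the per-directory
   * limit of the underlying filesystem (~32k on ext3).
   */
  public static boolean shouldCleanup(File filecacheDir, int maxEntries) {
    String[] entries = filecacheDir.list();
    int count = (entries == null) ? 0 : entries.length;
    return count >= maxEntries;
  }

  public static void main(String[] args) {
    File dir = new File("/tmp/nm-local-dir/usercache/root/filecache");
    System.out.println("cleanup needed: " + shouldCleanup(dir, 30000));
  }
}
{code}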

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-516) TestContainerLocalizer.testContainerLocalizerMain is failing

2013-04-03 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621255#comment-13621255
 ] 

Eli Collins commented on YARN-516:
--

I reverted this change (and the initial HADOOP-9357 patch). We'll put this fix 
back in the HADOOP-9357 patch if we do another rev.

 TestContainerLocalizer.testContainerLocalizerMain is failing
 

 Key: YARN-516
 URL: https://issues.apache.org/jira/browse/YARN-516
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Andrew Wang
 Fix For: 2.0.5-beta

 Attachments: YARN-516.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-412) FifoScheduler incorrectly checking for node locality

2013-04-03 Thread Roger Hoover (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Hoover updated YARN-412:
--

Attachment: YARN-412.patch

 FifoScheduler incorrectly checking for node locality
 

 Key: YARN-412
 URL: https://issues.apache.org/jira/browse/YARN-412
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Roger Hoover
Assignee: Roger Hoover
Priority: Minor
  Labels: patch
 Attachments: YARN-412.patch


 In the FifoScheduler, the assignNodeLocalContainers method is checking if the 
 data is local to a node by searching for the nodeAddress of the node in the 
 set of outstanding requests for the app.  This seems to be incorrect as it 
 should be checking hostname instead.  The offending line of code is 455:
 application.getResourceRequest(priority, node.getRMNode().getNodeAddress());
 Requests are formatted by hostname (e.g. host1.foo.com), whereas node addresses 
 are a concatenation of hostname and command port (e.g. host1.foo.com:1234).
 In the CapacityScheduler, it's done using hostname.  See 
 LeafQueue.assignNodeLocalContainers, line 1129
 application.getResourceRequest(priority, node.getHostName());
 Note that this bug does not affect the actual scheduling decisions made by 
 the FifoScheduler because even though it incorrectly determines that a request 
 is not local to the node, it will still schedule the request immediately 
 because it's rack-local.  However, this bug may be adversely affecting the 
 reporting of job status by underreporting the number of tasks that were node 
 local.
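 A toy illustration of the mismatch follows; plain strings and a map stand in for 
 the real scheduler types, so this is only a sketch of the described lookup bug, 
 not the attached patch.
{code}
// Requests are keyed by hostname, so a lookup by the host:port node address
// can never match. Strings and a HashMap stand in for the scheduler types.
import java.util.HashMap;
import java.util.Map;

public class Yarn412LookupMismatch {
  public static void main(String[] args) {
    Map<String, Integer> outstandingRequests = new HashMap<String, Integer>();
    outstandingRequests.put("host1.foo.com", 3);   // keyed by hostname

    String nodeAddress = "host1.foo.com:1234";     // hostname + command port
    String hostName = "host1.foo.com";

    // FifoScheduler today (line 455): lookup by node address always misses.
    System.out.println("by nodeAddress: " + outstandingRequests.get(nodeAddress)); // null
    // CapacityScheduler style (LeafQueue, line 1129): lookup by hostname hits.
    System.out.println("by hostName:    " + outstandingRequests.get(hostName));    // 3
  }
}
{code}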

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-412) FifoScheduler incorrectly checking for node locality

2013-04-03 Thread Roger Hoover (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Hoover updated YARN-412:
--

Attachment: (was: YARN-412.patch)

 FifoScheduler incorrectly checking for node locality
 

 Key: YARN-412
 URL: https://issues.apache.org/jira/browse/YARN-412
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Roger Hoover
Assignee: Roger Hoover
Priority: Minor
  Labels: patch
 Attachments: YARN-412.patch


 In the FifoScheduler, the assignNodeLocalContainers method is checking if the 
 data is local to a node by searching for the nodeAddress of the node in the 
 set of outstanding requests for the app.  This seems to be incorrect as it 
 should be checking hostname instead.  The offending line of code is 455:
 application.getResourceRequest(priority, node.getRMNode().getNodeAddress());
 Requests are formatted by hostname (e.g. host1.foo.com), whereas node addresses 
 are a concatenation of hostname and command port (e.g. host1.foo.com:1234).
 In the CapacityScheduler, it's done using hostname.  See 
 LeafQueue.assignNodeLocalContainers, line 1129
 application.getResourceRequest(priority, node.getHostName());
 Note that this bug does not affect the actual scheduling decisions made by 
 the FifoScheduler because even though it incorrectly determines that a request 
 is not local to the node, it will still schedule the request immediately 
 because it's rack-local.  However, this bug may be adversely affecting the 
 reporting of job status by underreporting the number of tasks that were node 
 local.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-99) Jobs fail during resource localization when private distributed-cache hits unix directory limits

2013-04-03 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-99?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated YARN-99:
--

Attachment: yarn-99-20130403.1.patch

 Jobs fail during resource localization when private distributed-cache hits 
 unix directory limits
 

 Key: YARN-99
 URL: https://issues.apache.org/jira/browse/YARN-99
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.0.0-alpha
Reporter: Devaraj K
Assignee: Omkar Vinit Joshi
 Attachments: yarn-99-20130324.patch, yarn-99-20130403.1.patch, 
 yarn-99-20130403.patch


 If we have multiple jobs that use the distributed cache with many small 
 files, the per-directory limit is reached before the cache size limit, and no 
 more directories can be created in the file cache. The jobs start failing 
 with the exception below.
 {code:xml}
 java.io.IOException: mkdir of 
 /tmp/nm-local-dir/usercache/root/filecache/1701886847734194975 failed
   at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:909)
   at 
 org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
   at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
   at 
 org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
   at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {code}
 We should have a mechanism to clean the cache files once they cross a 
 specified number of directories, analogous to the existing cache size limit.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-536) Remove ContainerStatus, ContainerState from Container api interface as they will not be called by the container object

2013-04-03 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-536:
---

Attachment: YARN-536.1.patch

 Remove ContainerStatus, ContainerState from Container api interface as they 
 will not be called by the container object
 --

 Key: YARN-536
 URL: https://issues.apache.org/jira/browse/YARN-536
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-536.1.patch


 Remove containerstate, containerStatus from container interface. They will 
 not be called by container object

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-537) Waiting containers are not informed if private localization for a resource fails.

2013-04-03 Thread Omkar Vinit Joshi (JIRA)
Omkar Vinit Joshi created YARN-537:
--

 Summary: Waiting containers are not informed if private 
localization for a resource fails.
 Key: YARN-537
 URL: https://issues.apache.org/jira/browse/YARN-537
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Omkar Vinit Joshi
Assignee: Omkar Vinit Joshi


In ResourceLocalizationService.LocalizerRunner.update(), if localization fails, 
only the initiator is informed; all the other waiting containers are not.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Moved] (YARN-538) RM address DNS lookup can cause unnecessary slowness on every JHS page load

2013-04-03 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur moved MAPREDUCE-5111 to YARN-538:


  Component/s: (was: jobhistoryserver)
Affects Version/s: (was: 2.0.3-alpha)
   2.0.3-alpha
  Key: YARN-538  (was: MAPREDUCE-5111)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

 RM address DNS lookup can cause unnecessary slowness on every JHS page load 
 

 Key: YARN-538
 URL: https://issues.apache.org/jira/browse/YARN-538
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-5111.patch


 When I run the job history server locally, every page load takes in the 10s 
 of seconds.  I profiled the process and discovered that all the extra time 
 was spent inside YarnConfiguration#getRMWebAppURL, trying to resolve 0.0.0.0 
 to a hostname.  When I changed my yarn.resourcemanager.address to localhost, 
 the page load times decreased drastically.
 There's no reason that we need to perform this resolution on every page load.
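 One possible shape of a fix, sketched below, is to resolve the URL once and reuse 
 it for the lifetime of the JHS process. The helper and config key shown are 
 stand-ins (the real resolution lives in YarnConfiguration#getRMWebAppURL); this is 
 not the attached patch.
{code}
// Hypothetical sketch: memoize the resolved RM web-app URL so the reverse DNS
// lookup happens at most once per process instead of on every page load.
import org.apache.hadoop.conf.Configuration;

public final class MemoizedRmWebAppUrl {
  private final Configuration conf;
  private volatile String cached;

  public MemoizedRmWebAppUrl(Configuration conf) {
    this.conf = conf;
  }

  public String get() {
    String url = cached;
    if (url == null) {
      synchronized (this) {
        if (cached == null) {
          // Stand-in for the existing YarnConfiguration#getRMWebAppURL logic,
          // which is the expensive step the profile pointed at.
          cached = conf.get("yarn.resourcemanager.webapp.address", "0.0.0.0:8088");
        }
        url = cached;
      }
    }
    return url;
  }
}
{code}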

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-539) Memory leak in case resource localization fails. LocalizedResource remains in memory.

2013-04-03 Thread Omkar Vinit Joshi (JIRA)
Omkar Vinit Joshi created YARN-539:
--

 Summary: Memory leak in case resource localization fails. 
LocalizedResource remains in memory.
 Key: YARN-539
 URL: https://issues.apache.org/jira/browse/YARN-539
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Omkar Vinit Joshi
Assignee: Omkar Vinit Joshi


If resource localization fails, the resource remains in memory and is
1) either cleaned up the next time cache cleanup runs and there is a space 
crunch (if sufficient space is available in the cache, it will remain in 
memory), or
2) reused if a LocalizationRequest comes in again for the same resource.

I think that when resource localization fails, that event should be sent to the 
LocalResourceTracker, which will then remove the resource from its cache.
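Purely as an illustration of that proposal (the names below are invented stand-ins, 
not the real NodeManager event classes), the tracker-side handling could be as 
simple as evicting the failed entry:
{code}
// Hypothetical shape of the proposed handling: on localization failure, notify
// the tracker so it drops the cached entry instead of leaking it.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

interface ResourceFailureListener {
  void onLocalizationFailed(String resourceKey, Exception cause);
}

public class EvictOnFailureTracker implements ResourceFailureListener {
  private final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<String, Object>();

  public void track(String resourceKey, Object localizedResource) {
    cache.put(resourceKey, localizedResource);
  }

  @Override
  public void onLocalizationFailed(String resourceKey, Exception cause) {
    cache.remove(resourceKey);   // failed resource is neither reused nor leaked
  }
}
{code}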

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-458) YARN daemon addresses must be placed in many different configs

2013-04-03 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621319#comment-13621319
 ] 

Alejandro Abdelnur commented on YARN-458:
-

+1. Do we need to do this for HS as well? If so please open a new JIRA.

 YARN daemon addresses must be placed in many different configs
 --

 Key: YARN-458
 URL: https://issues.apache.org/jira/browse/YARN-458
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-458.patch


 The YARN resourcemanager's address is included in four different configs: 
 yarn.resourcemanager.scheduler.address, 
 yarn.resourcemanager.resource-tracker.address, yarn.resourcemanager.address, 
 and yarn.resourcemanager.admin.address
 A new user trying to configure a cluster needs to know the names of all four 
 of these configs.
 The same issue exists for nodemanagers.
 It would be much easier if they could simply specify 
 yarn.resourcemanager.hostname and yarn.nodemanager.hostname and have the 
 default ports for the other ones kick in.
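 To make the proposal concrete, a minimal sketch is below; the port numbers shown 
 are the usual defaults and are assumptions here, not part of this JIRA.
{code}
// Illustrative only: what the proposal would mean for a minimal configuration.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class Yarn458Example {
  public static void main(String[] args) {
    Configuration conf = new YarnConfiguration();

    // Today: every RM address must be spelled out separately.
    conf.set("yarn.resourcemanager.address", "rm.example.com:8032");
    conf.set("yarn.resourcemanager.scheduler.address", "rm.example.com:8030");
    conf.set("yarn.resourcemanager.resource-tracker.address", "rm.example.com:8031");
    conf.set("yarn.resourcemanager.admin.address", "rm.example.com:8033");

    // Proposed: one hostname key per daemon, with default ports kicking in.
    conf.set("yarn.resourcemanager.hostname", "rm.example.com");
    conf.set("yarn.nodemanager.hostname", "nm1.example.com");
  }
}
{code}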

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-516) TestContainerLocalizer.testContainerLocalizerMain is failing

2013-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621324#comment-13621324
 ] 

Hudson commented on YARN-516:
-

Integrated in Hadoop-trunk-Commit #3555 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3555/])
Revert YARN-516 per HADOOP-9357. (Revision 1464181)

 Result = SUCCESS
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1464181
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestContainerLocalizer.java


 TestContainerLocalizer.testContainerLocalizerMain is failing
 

 Key: YARN-516
 URL: https://issues.apache.org/jira/browse/YARN-516
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Andrew Wang
 Fix For: 2.0.5-beta

 Attachments: YARN-516.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-412) FifoScheduler incorrectly checking for node locality

2013-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621338#comment-13621338
 ] 

Hadoop QA commented on YARN-412:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12576838/YARN-412.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/665//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/665//console

This message is automatically generated.

 FifoScheduler incorrectly checking for node locality
 

 Key: YARN-412
 URL: https://issues.apache.org/jira/browse/YARN-412
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Roger Hoover
Assignee: Roger Hoover
Priority: Minor
  Labels: patch
 Attachments: YARN-412.patch


 In the FifoScheduler, the assignNodeLocalContainers method is checking if the 
 data is local to a node by searching for the nodeAddress of the node in the 
 set of outstanding requests for the app.  This seems to be incorrect as it 
 should be checking hostname instead.  The offending line of code is 455:
 application.getResourceRequest(priority, node.getRMNode().getNodeAddress());
 Requests are formatted by hostname (e.g. host1.foo.com), whereas node addresses 
 are a concatenation of hostname and command port (e.g. host1.foo.com:1234).
 In the CapacityScheduler, it's done using hostname.  See 
 LeafQueue.assignNodeLocalContainers, line 1129
 application.getResourceRequest(priority, node.getHostName());
 Note that this bug does not affect the actual scheduling decisions made by 
 the FifoScheduler because even though it incorrectly determines that a request 
 is not local to the node, it will still schedule the request immediately 
 because it's rack-local.  However, this bug may be adversely affecting the 
 reporting of job status by underreporting the number of tasks that were node 
 local.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-193) Scheduler.normalizeRequest does not account for allocation requests that exceed maximumAllocation limits

2013-04-03 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-193:
-

Attachment: YARN-193.14.patch

Fixed the buggy test TestResourceManager#testResourceManagerInitConfigValidation

 Scheduler.normalizeRequest does not account for allocation requests that 
 exceed maximumAllocation limits 
 -

 Key: YARN-193
 URL: https://issues.apache.org/jira/browse/YARN-193
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.2-alpha, 3.0.0
Reporter: Hitesh Shah
Assignee: Zhijie Shen
 Attachments: MR-3796.1.patch, MR-3796.2.patch, MR-3796.3.patch, 
 MR-3796.wip.patch, YARN-193.10.patch, YARN-193.11.patch, YARN-193.12.patch, 
 YARN-193.13.patch, YARN-193.14.patch, YARN-193.4.patch, YARN-193.5.patch, 
 YARN-193.6.patch, YARN-193.7.patch, YARN-193.8.patch, YARN-193.9.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-537) Waiting containers are not informed if private localization for a resource fails.

2013-04-03 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621381#comment-13621381
 ] 

Vinod Kumar Vavilapalli commented on YARN-537:
--

Yup, I put in a comment a long (long) time back asking why they aren't getting 
informed through the LocalizedResource, which knows about all the waiting 
containers. I think we should do that.

 Waiting containers are not informed if private localization for a resource 
 fails.
 -

 Key: YARN-537
 URL: https://issues.apache.org/jira/browse/YARN-537
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Omkar Vinit Joshi
Assignee: Omkar Vinit Joshi

 In ResourceLocalizationService.LocalizerRunner.update(), if localization fails, 
 only the initiator is informed; all the other waiting containers are not.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-539) LocalizedResources are leaked in memory in case resource localization fails

2013-04-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-539:
-

Summary: LocalizedResources are leaked in memory in case resource 
localization fails  (was: Memory leak in case resource localization fails. 
LocalizedResource remains in memory.)

 LocalizedResources are leaked in memory in case resource localization fails
 ---

 Key: YARN-539
 URL: https://issues.apache.org/jira/browse/YARN-539
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Omkar Vinit Joshi
Assignee: Omkar Vinit Joshi

 If resource localization fails, the resource remains in memory and is
 1) either cleaned up the next time cache cleanup runs and there is a space 
 crunch (if sufficient space is available in the cache, it will remain in 
 memory), or
 2) reused if a LocalizationRequest comes in again for the same resource.
 I think that when resource localization fails, that event should be sent to the 
 LocalResourceTracker, which will then remove the resource from its cache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-458) YARN daemon addresses must be placed in many different configs

2013-04-03 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621385#comment-13621385
 ] 

Vinod Kumar Vavilapalli commented on YARN-458:
--

+1 for the patch after the fact. Thanks for doing this Sandy.

 YARN daemon addresses must be placed in many different configs
 --

 Key: YARN-458
 URL: https://issues.apache.org/jira/browse/YARN-458
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 2.0.5-beta

 Attachments: YARN-458.patch


 The YARN resourcemanager's address is included in four different configs: 
 yarn.resourcemanager.scheduler.address, 
 yarn.resourcemanager.resource-tracker.address, yarn.resourcemanager.address, 
 and yarn.resourcemanager.admin.address
 A new user trying to configure a cluster needs to know the names of all four 
 of these configs.
 The same issue exists for nodemanagers.
 It would be much easier if they could simply specify 
 yarn.resourcemanager.hostname and yarn.nodemanager.hostname and have the 
 default ports for the other ones kick in.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-536) Remove ContainerStatus, ContainerState from Container api interface as they will not be called by the container object

2013-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621390#comment-13621390
 ] 

Hadoop QA commented on YARN-536:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12576856/YARN-536.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/667//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/667//console

This message is automatically generated.

 Remove ContainerStatus, ContainerState from Container api interface as they 
 will not be called by the container object
 --

 Key: YARN-536
 URL: https://issues.apache.org/jira/browse/YARN-536
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-536.1.patch, YARN-536.2.patch


 Remove containerstate, containerStatus from container interface. They will 
 not be called by container object

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-535) TestUnmanagedAMLauncher can corrupt target/test-classes/yarn-site.xml during write phase, breaks later test runs

2013-04-03 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621573#comment-13621573
 ] 

Chris Nauroth commented on YARN-535:


{{TestDistributedShell#setup}} has nearly identical code to overwrite 
yarn-site.xml.

 TestUnmanagedAMLauncher can corrupt target/test-classes/yarn-site.xml during 
 write phase, breaks later test runs
 

 Key: YARN-535
 URL: https://issues.apache.org/jira/browse/YARN-535
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications
Affects Versions: 3.0.0
 Environment: OS/X laptop, HFS+ filesystem
Reporter: Steve Loughran
Priority: Minor

 the setup phase of {{TestUnmanagedAMLauncher}} overwrites {{yarn-site.xml}}. 
 As {{Configuration.writeXml()}} does a reread of all resources, this will 
 break if the (open-for-writing) resource is already visible as an empty file. 
 This leaves a corrupted {{target/test-classes/yarn-site.xml}}, which breaks 
 later test runs -because it is not overwritten by later incremental builds, 
 due to timestamps.
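 One way a test could sidestep that race, sketched under the assumption that 
 writing to a temp file and renaming it into place is acceptable on the build's 
 filesystem, is shown below; this is only an illustration, not the project's agreed 
 fix.
{code}
// Sketch: serialize the Configuration to a temp file first, then rename it
// into place, so a half-written yarn-site.xml is never visible as a resource
// while Configuration.writeXml re-reads the classpath.
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;

public final class SafeSiteXmlWriter {

  private SafeSiteXmlWriter() {
  }

  public static void writeYarnSite(Configuration conf, File target) throws IOException {
    File tmp = new File(target.getParentFile(), target.getName() + ".tmp");
    OutputStream out = new FileOutputStream(tmp);
    try {
      conf.writeXml(out);
    } finally {
      out.close();
    }
    if (!tmp.renameTo(target)) {
      throw new IOException("Could not rename " + tmp + " to " + target);
    }
  }
}
{code}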

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-412) FifoScheduler incorrectly checking for node locality

2013-04-03 Thread Roger Hoover (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Hoover updated YARN-412:
--

Attachment: (was: YARN-412.patch)

 FifoScheduler incorrectly checking for node locality
 

 Key: YARN-412
 URL: https://issues.apache.org/jira/browse/YARN-412
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Roger Hoover
Assignee: Roger Hoover
Priority: Minor
  Labels: patch
 Attachments: YARN-412.patch


 In the FifoScheduler, the assignNodeLocalContainers method is checking if the 
 data is local to a node by searching for the nodeAddress of the node in the 
 set of outstanding requests for the app.  This seems to be incorrect as it 
 should be checking hostname instead.  The offending line of code is 455:
 application.getResourceRequest(priority, node.getRMNode().getNodeAddress());
 Requests are formatted by hostname (e.g. host1.foo.com), whereas node addresses 
 are a concatenation of hostname and command port (e.g. host1.foo.com:1234).
 In the CapacityScheduler, it's done using hostname.  See 
 LeafQueue.assignNodeLocalContainers, line 1129
 application.getResourceRequest(priority, node.getHostName());
 Note that this bug does not affect the actual scheduling decisions made by 
 the FifoScheduler because even though it incorrectly determines that a request 
 is not local to the node, it will still schedule the request immediately 
 because it's rack-local.  However, this bug may be adversely affecting the 
 reporting of job status by underreporting the number of tasks that were node 
 local.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-412) FifoScheduler incorrectly checking for node locality

2013-04-03 Thread Roger Hoover (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Hoover updated YARN-412:
--

Attachment: YARN-412.patch

 FifoScheduler incorrectly checking for node locality
 

 Key: YARN-412
 URL: https://issues.apache.org/jira/browse/YARN-412
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Roger Hoover
Assignee: Roger Hoover
Priority: Minor
  Labels: patch
 Attachments: YARN-412.patch


 In the FifoScheduler, the assignNodeLocalContainers method is checking if the 
 data is local to a node by searching for the nodeAddress of the node in the 
 set of outstanding requests for the app.  This seems to be incorrect as it 
 should be checking hostname instead.  The offending line of code is 455:
 application.getResourceRequest(priority, node.getRMNode().getNodeAddress());
 Requests are formatted by hostname (e.g. host1.foo.com), whereas node addresses 
 are a concatenation of hostname and command port (e.g. host1.foo.com:1234).
 In the CapacityScheduler, it's done using hostname.  See 
 LeafQueue.assignNodeLocalContainers, line 1129
 application.getResourceRequest(priority, node.getHostName());
 Note that this bug does not affect the actual scheduling decisions made by 
 the FifoScheduler because even though it incorrectly determines that a request 
 is not local to the node, it will still schedule the request immediately 
 because it's rack-local.  However, this bug may be adversely affecting the 
 reporting of job status by underreporting the number of tasks that were node 
 local.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-412) FifoScheduler incorrectly checking for node locality

2013-04-03 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621709#comment-13621709
 ] 

Hitesh Shah commented on YARN-412:
--

@Roger, for future reference (it may not be applicable to this jira), it is good 
to leave earlier patch attachments lying around and not delete them when 
uploading newer patches. They can be used to trace review comments/feedback etc.

As for the hadoop-common mvn eclipse:eclipse failure, it can be ignored for now. 
It is a known issue with an open jira that has not been addressed yet.

 FifoScheduler incorrectly checking for node locality
 

 Key: YARN-412
 URL: https://issues.apache.org/jira/browse/YARN-412
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Roger Hoover
Assignee: Roger Hoover
Priority: Minor
  Labels: patch
 Attachments: YARN-412.patch


 In the FifoScheduler, the assignNodeLocalContainers method is checking if the 
 data is local to a node by searching for the nodeAddress of the node in the 
 set of outstanding requests for the app.  This seems to be incorrect as it 
 should be checking hostname instead.  The offending line of code is 455:
 application.getResourceRequest(priority, node.getRMNode().getNodeAddress());
 Requests are formatted by hostname (e.g. host1.foo.com), whereas node addresses 
 are a concatenation of hostname and command port (e.g. host1.foo.com:1234).
 In the CapacityScheduler, it's done using hostname.  See 
 LeafQueue.assignNodeLocalContainers, line 1129
 application.getResourceRequest(priority, node.getHostName());
 Note that this bug does not affect the actual scheduling decisions made by 
 the FifoScheduler because even though it incorrectly determines that a request 
 is not local to the node, it will still schedule the request immediately 
 because it's rack-local.  However, this bug may be adversely affecting the 
 reporting of job status by underreporting the number of tasks that were node 
 local.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-536) Remove ContainerStatus, ContainerState from Container api interface as they will not be called by the container object

2013-04-03 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621720#comment-13621720
 ] 

Vinod Kumar Vavilapalli commented on YARN-536:
--

+1, this looks good. Checking it in.

 Remove ContainerStatus, ContainerState from Container api interface as they 
 will not be called by the container object
 --

 Key: YARN-536
 URL: https://issues.apache.org/jira/browse/YARN-536
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-536.1.patch, YARN-536.2.patch


 Remove containerstate, containerStatus from container interface. They will 
 not be called by container object

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-536) Remove ContainerStatus, ContainerState from Container api interface as they will not be called by the container object

2013-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621733#comment-13621733
 ] 

Hudson commented on YARN-536:
-

Integrated in Hadoop-trunk-Commit #3560 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3560/])
YARN-536. Removed the unused objects ContainerStatus and ContainerState 
from Container which also don't belong to the container. Contributed by Xuan 
Gong. (Revision 1464271)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1464271
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/Container.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ContainerPBImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/BuilderUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/NodeManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java


 Remove ContainerStatus, ContainerState from Container api interface as they 
 will not be called by the container object
 --

 Key: YARN-536
 URL: https://issues.apache.org/jira/browse/YARN-536
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Fix For: 2.0.5-beta

 Attachments: YARN-536.1.patch, YARN-536.2.patch


 Remove containerstate, containerStatus from container interface. They will 
 not be called by container object

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-412) FifoScheduler incorrectly checking for node locality

2013-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13621735#comment-13621735
 ] 

Hadoop QA commented on YARN-412:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12576914/YARN-412.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/669//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/669//console

This message is automatically generated.

 FifoScheduler incorrectly checking for node locality
 

 Key: YARN-412
 URL: https://issues.apache.org/jira/browse/YARN-412
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Roger Hoover
Assignee: Roger Hoover
Priority: Minor
  Labels: patch
 Attachments: YARN-412.patch


 In the FifoScheduler, the assignNodeLocalContainers method is checking if the 
 data is local to a node by searching for the nodeAddress of the node in the 
 set of outstanding requests for the app.  This seems to be incorrect as it 
 should be checking hostname instead.  The offending line of code is 455:
 application.getResourceRequest(priority, node.getRMNode().getNodeAddress());
 Requests are formatted by hostname (e.g. host1.foo.com), whereas node addresses 
 are a concatenation of hostname and command port (e.g. host1.foo.com:1234).
 In the CapacityScheduler, it's done using hostname.  See 
 LeafQueue.assignNodeLocalContainers, line 1129
 application.getResourceRequest(priority, node.getHostName());
 Note that this bug does not affect the actual scheduling decisions made by 
 the FifoScheduler because even though it incorrectly determines that a request 
 is not local to the node, it will still schedule the request immediately 
 because it's rack-local.  However, this bug may be adversely affecting the 
 reporting of job status by underreporting the number of tasks that were node 
 local.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira