[jira] [Created] (YARN-7580) ContainersMonitorImpl logged message lacks detail when exceeding memory limits

2017-11-28 Thread Wilfred Spiegelenburg (JIRA)
Wilfred Spiegelenburg created YARN-7580:
---

 Summary: ContainersMonitorImpl logged message lacks detail when 
exceeding memory limits
 Key: YARN-7580
 URL: https://issues.apache.org/jira/browse/YARN-7580
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.1.0
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg


Currently, in the RM logs, memory usage for a container that exceeds its 
memory limit is reported like this:
{code}
2016-06-14 09:15:36,694 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1464251583966_0932_r_000876_0: Container [pid=134938,containerID=container_1464251583966_0932_01_002237] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 1.9 GB of 2.1 GB virtual memory used. Killing container.
{code}

Two enhancements as part of this jira:
- make it clearer which limit we exceed
- show exactly how much we exceeded the limit by
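
As a rough illustration of the two enhancements above, the sketch below builds a 
message that names the exceeded limit and reports the excess in bytes, so that a 
case such as "1.0 GB of 1 GB" no longer hides the overshoot behind rounding. The 
class MemoryLimitMessage, its methods, and the message wording are hypothetical 
and are not the actual ContainersMonitorImpl code.

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of YARN. */
final class MemoryLimitMessage {

  private static final double BYTES_PER_GB = 1024.0 * 1024.0 * 1024.0;

  /** Formats a byte count as GB with two decimals. */
  static String gb(long bytes) {
    return String.format(Locale.ROOT, "%.2f GB", bytes / BYTES_PER_GB);
  }

  /**
   * Builds a diagnostic that states which limit was exceeded
   * (for example "physical" or "virtual") and by how many bytes.
   */
  static String format(String containerId, String limitName,
      long usedBytes, long limitBytes) {
    long excessBytes = usedBytes - limitBytes;
    return String.format(Locale.ROOT,
        "Container [containerID=%s] is running %d bytes beyond the %s memory "
            + "limit. Current usage: %s of %s %s memory used. Killing container.",
        containerId, excessBytes, limitName,
        gb(usedBytes), gb(limitBytes), limitName);
  }

  public static void main(String[] args) {
    long limit = 1L << 30;              // 1 GB limit
    long used = limit + 7L * (1 << 20); // 7 MB over the limit
    System.out.println(format(
        "container_1464251583966_0932_01_002237", "physical", used, limit));
  }
}
```

Reporting the raw byte difference alongside the rounded values is one option; a 
real patch might prefer a different unit or precision.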



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Created] (YARN-7579) Add support for FPGA information shown in webUI

2017-11-28 Thread Zhankun Tang (JIRA)
Zhankun Tang created YARN-7579:
--

 Summary: Add support for FPGA information shown in webUI
 Key: YARN-7579
 URL: https://issues.apache.org/jira/browse/YARN-7579
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhankun Tang


Support retrieving FPGA information via REST and viewing it in the web UI.






[jira] [Created] (YARN-7578) Extend TestDiskFailures.waitForDiskHealthCheck() sleeping time.

2017-11-28 Thread Guangming Zhang (JIRA)
Guangming Zhang created YARN-7578:
-

 Summary: Extend TestDiskFailures.waitForDiskHealthCheck() sleeping 
time.
 Key: YARN-7578
 URL: https://issues.apache.org/jira/browse/YARN-7578
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 3.1.0
 Environment: ARMv8 AArch64, Ubuntu16.04
Reporter: Guangming Zhang
Priority: Minor
 Fix For: 3.1.0


Thread.sleep() is called to wait for the NodeManager to identify disk 
failures. In some cases, for example on lower-end hardware, the sleep time is 
too short and the NodeManager may not have finished identifying disk failures 
when the test checks. This causes test failures:

{code:java}
Running org.apache.hadoop.yarn.server.TestDiskFailures
Tests run: 3, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 17.686 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.TestDiskFailures
testLocalDirsFailures(org.apache.hadoop.yarn.server.TestDiskFailures)  Time elapsed: 10.412 sec  <<< FAILURE!
java.lang.AssertionError: NodeManager could not identify disk failure.
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.apache.hadoop.yarn.server.TestDiskFailures.verifyDisksHealth(TestDiskFailures.java:239)
at org.apache.hadoop.yarn.server.TestDiskFailures.testDirsFailures(TestDiskFailures.java:186)
at org.apache.hadoop.yarn.server.TestDiskFailures.testLocalDirsFailures(TestDiskFailures.java:99)

testLogDirsFailures(org.apache.hadoop.yarn.server.TestDiskFailures)  Time elapsed: 5.99 sec  <<< FAILURE!
java.lang.AssertionError: NodeManager could not identify disk failure.
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.apache.hadoop.yarn.server.TestDiskFailures.verifyDisksHealth(TestDiskFailures.java:239)
at org.apache.hadoop.yarn.server.TestDiskFailures.testDirsFailures(TestDiskFailures.java:186)
at org.apache.hadoop.yarn.server.TestDiskFailures.testLogDirsFailures(TestDiskFailures.java:111)

{code}

The proposal is to extend the sleep time from 1000 ms to 1500 ms to avoid these unit test failures.
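
For comparison only, and not what this issue proposes: a fixed sleep can be 
replaced by polling a condition until a deadline, which tolerates slow hardware 
without slowing down fast machines. The sketch below is a minimal, self-contained 
version of that idea; PollingWait and waitFor are hypothetical names, and the 
lambda in main is just a stand-in for "the NodeManager has noticed the failed 
disk".

```java
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper for illustration; not part of the YARN tests. */
final class PollingWait {

  /**
   * Polls the condition every intervalMs until it holds or timeoutMs elapses.
   * Returns true if the condition held, false on timeout or interruption.
   */
  static boolean waitFor(BooleanSupplier condition, long intervalMs,
      long timeoutMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
        return false;
      }
    }
  }

  public static void main(String[] args) {
    long start = System.currentTimeMillis();
    // Stand-in condition: becomes true roughly 300 ms after start.
    boolean detected =
        waitFor(() -> System.currentTimeMillis() - start >= 300, 50, 1500);
    System.out.println("detected=" + detected);
  }
}
```

With this shape the 1500 ms value becomes an upper bound rather than a fixed 
cost paid on every run.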






Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-11-28 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/607/

[Nov 27, 2017 6:19:58 PM] (jianhe) YARN-6168. Restarted RM may not inform AM 
about all existing containers.
[Nov 27, 2017 10:31:52 PM] (yufei) YARN-7363. ContainerLocalizer don't have a 
valid log4j config in case of
[Nov 28, 2017 3:48:55 AM] (yqlin) HDFS-12858. Add router admin commands usage 
in HDFS commands reference
[Nov 28, 2017 11:52:59 AM] (stevel) HADOOP-15042. Azure 
PageBlobInputStream.skip() can return negative value
[Nov 28, 2017 1:07:11 PM] (sunilg) YARN-7499. Layout changes to Application 
details page in new YARN UI.




-1 overall


The following subsystems voted -1:
asflicense findbugs unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
   org.apache.hadoop.yarn.api.records.Resource.getResources() may expose 
internal representation by returning Resource.resources At 
Resource.java:[line 213] 

Failed junit tests :

   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure 
   hadoop.fs.viewfs.TestViewFileSystemLinkFallback 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 
   hadoop.fs.viewfs.TestViewFsWithXAttrs 
   hadoop.hdfs.TestQuota 
   hadoop.hdfs.TestMaintenanceState 
   hadoop.hdfs.TestErasureCodingPoliciesWithRandomECPolicy 
   hadoop.hdfs.TestSetrepIncreasing 
   hadoop.hdfs.TestDFSStripedInputStream 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210 
   hadoop.fs.viewfs.TestViewFileSystemHdfs 
   hadoop.hdfs.TestDFSStripedOutputStream 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure190 
   hadoop.hdfs.TestClientProtocolForPipelineRecovery 
   hadoop.hdfs.server.balancer.TestBalancerRPCDelay 
   hadoop.hdfs.TestUnsetAndChangeDirectoryEcPolicy 
   hadoop.fs.viewfs.TestViewFileSystemLinkMergeSlash 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.hdfs.TestErasureCodingPolicies 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure060 
   hadoop.fs.TestUnbuffer 
   hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation 
   hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesAttempts 
   hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobs 
   hadoop.yarn.service.TestServiceAM 
   hadoop.yarn.service.TestYarnNativeServices 
   hadoop.yarn.sls.nodemanager.TestNMSimulator 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/607/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/607/artifact/out/diff-compile-javac-root.txt
  [276K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/607/artifact/out/diff-checkstyle-root.txt
  [17M]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/607/artifact/out/diff-patch-pylint.txt
  [20K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/607/artifact/out/diff-patch-shellcheck.txt
  [20K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/607/artifact/out/diff-patch-shelldocs.txt
  [12K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/607/artifact/out/whitespace-eol.txt
  [8.8M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/607/artifact/out/whitespace-tabs.txt
  [288K]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/607/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api-warnings.html
  [8.0K]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/607/artifact/out/diff-javadoc-javadoc-root.txt
  [760K]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/607/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [1.8M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/607/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
  [80K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/607/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-hs.txt
  [104K]
   

[jira] [Created] (YARN-7577) Unit Fail: TestAMRestart#testPreemptedAMRestartOnRMRestart

2017-11-28 Thread Miklos Szegedi (JIRA)
Miklos Szegedi created YARN-7577:


 Summary: Unit Fail: TestAMRestart#testPreemptedAMRestartOnRMRestart
 Key: YARN-7577
 URL: https://issues.apache.org/jira/browse/YARN-7577
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Miklos Szegedi
Assignee: Miklos Szegedi


This happens if Fair Scheduler is the default scheduler. The test should run 
with both schedulers.






[jira] [Created] (YARN-7576) Findbug warning for Resource exposing internal representation

2017-11-28 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-7576:


 Summary: Findbug warning for Resource exposing internal 
representation
 Key: YARN-7576
 URL: https://issues.apache.org/jira/browse/YARN-7576
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api
Affects Versions: 3.0.0
Reporter: Jason Lowe


Precommit builds are complaining about a findbugs warning:
{noformat}
EI  org.apache.hadoop.yarn.api.records.Resource.getResources() may expose 
internal representation by returning Resource.resources

Bug type EI_EXPOSE_REP (click for details)
In class org.apache.hadoop.yarn.api.records.Resource
In method org.apache.hadoop.yarn.api.records.Resource.getResources()
Field org.apache.hadoop.yarn.api.records.Resource.resources
At Resource.java:[line 213]

Returning a reference to a mutable object value stored in one of the object's 
fields exposes the internal representation of the object.  If instances are 
accessed by untrusted code, and unchecked changes to the mutable object would 
compromise security or other important properties, you will need to do 
something different. Returning a new copy of the object is better approach in 
many situations.
{noformat}
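
To make the warning concrete, the sketch below shows the EI_EXPOSE_REP pattern 
on a simplified stand-in class and one common remedy, returning a defensive 
copy. Holder and its methods are hypothetical; the real Resource class uses a 
different field type, and a copy may or may not be the right fix there.

```java
/** Illustration of EI_EXPOSE_REP on a hypothetical class; not YARN code. */
final class ExposeRepDemo {

  static final class Holder {
    private final long[] values;

    Holder(long[] values) {
      this.values = values.clone(); // copy on the way in as well
    }

    /** Flagged pattern: callers receive the internal array itself. */
    long[] getValuesExposed() {
      return values;
    }

    /** Defensive copy: callers cannot modify internal state. */
    long[] getValuesCopy() {
      return values.clone();
    }

    long first() {
      return values[0];
    }
  }

  public static void main(String[] args) {
    Holder h = new Holder(new long[] {1L, 2L});

    h.getValuesCopy()[0] = 99L;    // modifies only the copy
    System.out.println(h.first()); // internal state unchanged

    h.getValuesExposed()[0] = 99L; // modifies the internal array
    System.out.println(h.first()); // internal state changed by the caller
  }
}
```

A copy on every call has a cost, so for a getter on a hot path suppressing the 
warning with a documented rationale can also be a reasonable choice.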







[jira] [Created] (YARN-7575) When using absolute capacity configuration with no max capacity, scheduler UI NPEs and can't grow queue

2017-11-28 Thread Eric Payne (JIRA)
Eric Payne created YARN-7575:


 Summary: When using absolute capacity configuration with no max 
capacity, scheduler UI NPEs and can't grow queue
 Key: YARN-7575
 URL: https://issues.apache.org/jira/browse/YARN-7575
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacity scheduler
Reporter: Eric Payne


I encountered the following while reviewing and testing branch YARN-5881.

The design document from YARN-5881 says that for max-capacity:
{quote}
3)  For each queue, we require:
a) if max-resource not set, it automatically set to parent.max-resource
{quote}

When I try leaving 
{{yarn.scheduler.capacity.<queue-path>.maximum-capacity}} blank, the RM UI 
scheduler page refuses to render. It looks like the failure is in 
{{CapacitySchedulerPage$LeafQueueInfoBlock}}:
{noformat}
2017-11-28 11:29:16,974 [qtp43473566-220] ERROR webapp.Dispatcher: error handling URI: /cluster/scheduler
java.lang.reflect.InvocationTargetException
...
at org.apache.hadoop.yarn.server.resourcemanager.webapp.CapacitySchedulerPage$LeafQueueInfoBlock.renderQueueCapacityInfo(CapacitySchedulerPage.java:164)
at org.apache.hadoop.yarn.server.resourcemanager.webapp.CapacitySchedulerPage$LeafQueueInfoBlock.renderLeafQueueInfoWithoutParition(CapacitySchedulerPage.java:129)
{noformat}

Also, a job will run in a leaf queue with no max capacity set and will grow 
to the max capacity of the cluster, but if I add resources to the node, the 
job won't grow any more even though it has pending resources.
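
The rule quoted from the design document (an unset max-resource falls back to 
the parent's) can be sketched as a null-safe lookup. Everything below is 
hypothetical: Queue, configuredMaxMb, and effectiveMaxMb are illustrative names 
and not the CapacityScheduler's classes, and resolving the root against the 
current cluster size is an assumption made for the example.

```java
/** Illustration of "unset max inherits from parent"; not YARN code. */
final class MaxResourceDefaulting {

  static final class Queue {
    final Queue parent;         // null for the root queue
    final Long configuredMaxMb; // null when max-resource is not set

    Queue(Queue parent, Long configuredMaxMb) {
      this.parent = parent;
      this.configuredMaxMb = configuredMaxMb;
    }

    /**
     * Resolves the effective maximum at read time: the queue's own setting
     * if present, otherwise the parent's, otherwise the cluster total.
     */
    long effectiveMaxMb(long clusterMaxMb) {
      if (configuredMaxMb != null) {
        return configuredMaxMb;
      }
      return parent != null ? parent.effectiveMaxMb(clusterMaxMb) : clusterMaxMb;
    }
  }

  public static void main(String[] args) {
    Queue root = new Queue(null, null);
    Queue leaf = new Queue(root, null); // no max configured anywhere

    System.out.println(leaf.effectiveMaxMb(16384L));
    // After resources are added, the same lookup reflects the larger cluster.
    System.out.println(leaf.effectiveMaxMb(32768L));
  }
}
```

Resolving the value on each read, rather than caching it once at configuration 
time, avoids both a null dereference when rendering and a stale ceiling after 
the cluster grows.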






[jira] [Created] (YARN-7574) Add support for Node Labels on Auto Created Leaf Queue Template

2017-11-28 Thread Suma Shivaprasad (JIRA)
Suma Shivaprasad created YARN-7574:
--

 Summary: Add support for Node Labels on Auto Created Leaf Queue 
Template
 Key: YARN-7574
 URL: https://issues.apache.org/jira/browse/YARN-7574
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Suma Shivaprasad
Assignee: Suma Shivaprasad


YARN-7473 adds support for auto-created leaf queues to inherit node label 
capacities from parent queues. However, there is no support for the leaf queue 
template to allow different configured capacities for different node labels. 






[jira] [Created] (YARN-7573) Gpu Information page could be empty for nodes without GPU

2017-11-28 Thread Sunil G (JIRA)
Sunil G created YARN-7573:
-

 Summary: Gpu Information page could be empty for nodes without GPU
 Key: YARN-7573
 URL: https://issues.apache.org/jira/browse/YARN-7573
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: webapp, yarn-ui-v2
Reporter: Sunil G
Assignee: Sunil G


In the new YARN UI, the node page is not accessible if that node doesn't have 
any GPU. Also, under the node page, when we click on "List of 
Containers/Applications", the Gpu Information left nav disappears.


