[jira] [Commented] (YARN-7645) TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers is flakey with FairScheduler

2018-01-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16314009#comment-16314009
 ] 

Hudson commented on YARN-7645:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13455 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13455/])
YARN-7645. (rkanter: rev 2aa4f0a55936239d35babd84da2a0d1a261bc9bd)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java


> TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers is 
> flakey with FairScheduler
> -
>
> Key: YARN-7645
> URL: https://issues.apache.org/jira/browse/YARN-7645
> Project: Hadoop YARN
> Issue Type: Bug
> Components: test
> Affects Versions: 3.0.0
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Fix For: 3.1.0
>
> Attachments: YARN-7645.001.patch
>
>
> We've noticed some flakiness in 
> {{TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers}} 
> when using {{FairScheduler}}:
> {noformat}
> java.lang.AssertionError: Attempt state is not correct (timeout). expected: but was:
>   at org.apache.hadoop.yarn.server.resourcemanager.TestContainerResourceUsage.amRestartTests(TestContainerResourceUsage.java:275)
>   at org.apache.hadoop.yarn.server.resourcemanager.TestContainerResourceUsage.testUsageAfterAMRestartWithMultipleContainers(TestContainerResourceUsage.java:254)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7645) TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers is flakey with FairScheduler

2018-01-05 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313888#comment-16313888
 ] 

Ray Chiang commented on YARN-7645:
--

+1.

I'm having difficulty reproducing the original error on my setup, but I'm not 
seeing any test issues with the new patch either.







[jira] [Commented] (YARN-7645) TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers is flakey with FairScheduler

2017-12-12 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288470#comment-16288470
 ] 

genericqa commented on YARN-7645:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  4m 27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 13s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m  9s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 33s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 57s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}120m  7s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7645 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901769/YARN-7645.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 858bc1266d1b 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 06f0eb2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/18893/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18893/testReport/ |
| Max. process+thread count | 878 (vs. ulimit of 5000) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 

[jira] [Commented] (YARN-7645) TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers is flakey with FairScheduler

2017-12-12 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288290#comment-16288290
 ] 

Robert Kanter commented on YARN-7645:
-

When the test passes, we see this sequence of log messages:
{noformat}
2017-12-11 11:21:36,837 INFO  [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(919)) - appattempt_1513020094849_0001_01 State change from SUBMITTED to SCHEDULED on event = ATTEMPT_ADDED
2017-12-11 11:21:36,837 DEBUG [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(188)) - Dispatching the event org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppNodeUpdateEvent.EventType: NODE_UPDATE
2017-12-11 11:21:36,837 DEBUG [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:handle(870)) - Processing event for application_1513020094849_0001 of type NODE_UPDATE
2017-12-11 11:21:36,837 DEBUG [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:processNodeUpdate(986)) - Received node update event:NODE_USABLE for node:127.0.0.1:1234 with state:RUNNING
2017-12-11 11:21:36,837 INFO  [Thread-1] resourcemanager.MockRM (MockRM.java:waitForState(283)) - App State is : ACCEPTED
2017-12-11 11:21:36,838 INFO  [Thread-1] resourcemanager.MockRM (MockRM.java:waitForState(357)) - Attempt State is : SCHEDULED
2017-12-11 11:21:36,838 INFO  [Thread-1] resourcemanager.MockRM (MockRM.java:launchAM(1168)) - Launch AM appattempt_1513020094849_0001_01
2017-12-11 11:21:36,979 DEBUG [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(188)) - Dispatching the event org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeStatusEvent.EventType: STATUS_UPDATE
2017-12-11 11:21:36,979 DEBUG [Thread-1] fair.FSLeafQueue (FSLeafQueue.java:updateDemand(322)) - The updated demand for root.default is ; the max is
2017-12-11 11:21:36,979 DEBUG [Thread-1] fair.FSLeafQueue (FSLeafQueue.java:updateDemand(324)) - The updated fairshare for root.default is
2017-12-11 11:21:36,979 DEBUG [Thread-1] fair.FSParentQueue (FSParentQueue.java:updateDemand(133)) - Counting resource from root.default ; Total resource demand for root now
2017-12-11 11:21:36,979 DEBUG [AsyncDispatcher event handler] rmnode.RMNodeImpl (RMNodeImpl.java:handle(666)) - Processing 127.0.0.1:1234 of type STATUS_UPDATE
2017-12-11 11:21:36,985 DEBUG [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(188)) - Dispatching the event org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.NodeUpdateSchedulerEvent.EventType: NODE_UPDATE
2017-12-11 11:21:36,986 DEBUG [Thread-1] fair.FSLeafQueue (FSLeafQueue.java:updateDemand(322)) - The updated demand for root.user is ; the max is
2017-12-11 11:21:36,986 DEBUG [Thread-1] fair.FSLeafQueue (FSLeafQueue.java:updateDemand(324)) - The updated fairshare for root.user is
2017-12-11 11:21:36,986 DEBUG [Thread-1] fair.FSParentQueue (FSParentQueue.java:updateDemand(133)) - Counting resource from root.user ; Total resource demand for root now
2017-12-11 11:21:36,986 DEBUG [Thread-1] fair.FSParentQueue (FSParentQueue.java:updateDemand(144)) - The updated demand for root is ; the max is
2017-12-11 11:21:36,986 DEBUG [Thread-1] fair.FSQueue (FSQueue.java:setFairShare(293)) - The updated fairShare for root is
2017-12-11 11:21:36,987 DEBUG [AsyncDispatcher event handler] scheduler.AbstractYarnScheduler (AbstractYarnScheduler.java:nodeUpdate(1083)) - nodeUpdate: 127.0.0.1:1234 cluster capacity:
2017-12-11 11:21:36,987 DEBUG [AsyncDispatcher event handler] scheduler.AbstractYarnScheduler (AbstractYarnScheduler.java:nodeUpdate(1116)) - Node being looked for scheduling 127.0.0.1:1234 availableResource:
2017-12-11 11:21:36,988 DEBUG [AsyncDispatcher event handler] fair.FSLeafQueue (FSLeafQueue.java:assignContainer(333)) - Node 127.0.0.1 offered to queue: root.user fairShare:
2017-12-11 11:21:37,049 DEBUG [AsyncDispatcher event handler] scheduler.AppSchedulingInfo (AppSchedulingInfo.java:updateMetricsForAllocatedContainer(589)) - allocate: applicationId=application_1513020094849_0001 container=container_1513020094849_0001_01_01 host=127.0.0.1:1234 user=user resource= type=OFF_SWITCH
{noformat}
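For readers unfamiliar with this kind of wait: the {{waitForState}} entries in the log above and the "Attempt state is not correct (timeout)" assertion in the issue description both suggest a poll-until-state-or-timeout helper. Below is a minimal sketch of that general pattern, not the actual {{MockRM}} code. The class name {{WaitForStateSketch}}, the method signature, and the string states in {{main}} are all hypothetical and chosen only for illustration.

```java
import java.util.function.Supplier;

// Illustrative sketch only: NOT the MockRM implementation.
// All names here are hypothetical.
final class WaitForStateSketch {
    private WaitForStateSketch() {}

    /**
     * Polls current.get() until it equals expected or the timeout elapses.
     * Returns true if the expected state was observed, false on timeout
     * or if the waiting thread was interrupted.
     */
    static <T> boolean waitForState(Supplier<T> current, T expected,
                                    long timeoutMs, long pollMs) {
        // Use the monotonic clock so wall-clock adjustments cannot skew the wait.
        final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!expected.equals(current.get())) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out; a test would fail its assertion here
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Simulated attempt state: SCHEDULED for two polls, then ALLOCATED.
        int[] polls = {0};
        Supplier<String> state = () -> ++polls[0] < 3 ? "SCHEDULED" : "ALLOCATED";
        boolean reached = waitForState(state, "ALLOCATED", 1000, 10);
        System.out.println("reached ALLOCATED: " + reached);
    }
}
```

With a helper of this shape, a timeout only means the expected state was not observed within the window. If nothing triggers the next scheduling pass during that window, the wait fails even though no transition was rejected.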
When it fails, we see this:
{noformat}
2017-12-08 11:58:46,248 INFO  [AsyncDispatcher event handler] 
attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(919)) -