[jira] [Comment Edited] (YARN-10425) Replace the legacy placement engine in CS with the new one

2021-01-22 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270425#comment-17270425
 ] 

Ahmed Hussein edited comment on YARN-10425 at 1/22/21, 8:57 PM:


[~shuzirra], [~pbacsko], [~snemeth], [~BilwaST]  thanks for the contribution.
I have a question about the changes introduced by this ticket. The following
code block is from
[CSMappingPlacementRule#L128|https://github.com/apache/hadoop/commit/567600fd80896c1c9b0db1f228368d4eb2a694a2#diff-92b5797cf7739d330364d967172e65e61a859c776d9ebe526aba03ea33039033R127]
 

{code:java}
if (groups == null) {
  //We cannot use Groups#getUserToGroupsMappingService here, because when
  //tests change the HADOOP_SECURITY_GROUP_MAPPING, Groups won't refresh its
  //cached instance of groups, so we might get a Group instance which
  //ignores the HADOOP_SECURITY_GROUP_MAPPING settings.
  groups = new Groups(conf);
}
{code}

IIUC, the design of the groups cache ({{Groups.cache}}) relies on {{Groups}}
being a singleton. Otherwise, behavior becomes inconsistent, especially in
classes like {{JniBasedUnixGroupsNetgroupMapping}} and
{{ShellBasedUnixGroupsNetgroupMapping}}. Both mapping implementations have a
second caching layer for netgroups, {{NetgroupCache}}.
I have the following two concerns regarding an independent {{Groups}} instance in
{{CSMappingPlacementRule.java}}:
* It breaks the design, leading to behavior that does not match what is
expected. As I mentioned, the contents of {{NetgroupCache}} won't be defined.
* Performance considerations. Allocating "N" instances of {{Groups}} means
fetching the user's groups "N" times. Therefore, the Guava {{CacheLoader}}
refresh will be done "N" times, and so on.

Why did you decide to make that change instead of fixing the design of the unit
tests?
IIUC, that bug needs to be fixed in a follow-up Jira.
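
To illustrate the performance concern, here is a minimal sketch. It does not use Hadoop's {{Groups}} class: {{CachedGroups}}, the backend call counter, and the method names are hypothetical stand-ins for a per-instance user-to-groups cache. It only shows that a shared instance hits the backend once for a user, while "N" independent instances hit it "N" times.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class GroupsCacheSketch {
  // Counts how often the (hypothetical) expensive backend lookup runs.
  static final AtomicInteger BACKEND_CALLS = new AtomicInteger();

  // Hypothetical stand-in for an object that owns its own user-to-groups cache.
  static final class CachedGroups {
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

    List<String> getGroups(String user) {
      // The backend is consulted only when this instance has no cached entry.
      return cache.computeIfAbsent(user, u -> {
        BACKEND_CALLS.incrementAndGet();
        return Collections.singletonList(u + "-group");
      });
    }
  }

  // All callers share one instance, so they also share one cache.
  static int lookupsWithSharedInstance(int callers, String user) {
    BACKEND_CALLS.set(0);
    CachedGroups shared = new CachedGroups();
    for (int i = 0; i < callers; i++) {
      shared.getGroups(user);
    }
    return BACKEND_CALLS.get();
  }

  // Every caller allocates its own instance, so every cache starts empty.
  static int lookupsWithIndependentInstances(int callers, String user) {
    BACKEND_CALLS.set(0);
    for (int i = 0; i < callers; i++) {
      new CachedGroups().getGroups(user);
    }
    return BACKEND_CALLS.get();
  }

  public static void main(String[] args) {
    System.out.println("shared: " + lookupsWithSharedInstance(5, "alice"));
    System.out.println("independent: " + lookupsWithIndependentInstances(5, "alice"));
  }
}
```

With five callers, the shared instance performs one backend lookup and the independent instances perform five.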


was (Author: ahussein):
[~shuzirra], [~pbacsko] thanks for the contribution.
I have a question about the changes introduced by this ticket. The following
code block is from
[CSMappingPlacementRule#L128|https://github.com/apache/hadoop/commit/567600fd80896c1c9b0db1f228368d4eb2a694a2#diff-92b5797cf7739d330364d967172e65e61a859c776d9ebe526aba03ea33039033R127]
 

{code:java}
if (groups == null) {
  //We cannot use Groups#getUserToGroupsMappingService here, because when
  //tests change the HADOOP_SECURITY_GROUP_MAPPING, Groups won't refresh its
  //cached instance of groups, so we might get a Group instance which
  //ignores the HADOOP_SECURITY_GROUP_MAPPING settings.
  groups = new Groups(conf);
}
{code}

IIUC, the design of the groups cache ({{Groups.cache}}) relies on {{Groups}}
being a singleton. Otherwise, behavior becomes inconsistent, especially in
classes like {{JniBasedUnixGroupsNetgroupMapping}} and
{{ShellBasedUnixGroupsNetgroupMapping}}. Both mapping implementations have a
second caching layer for netgroups, {{NetgroupCache}}.
I have the following two concerns regarding an independent {{Groups}} instance in
{{CSMappingPlacementRule.java}}:
* It breaks the design, leading to behavior that does not match what is
expected. As I mentioned, the contents of {{NetgroupCache}} won't be defined.
* Performance considerations. Allocating "N" instances of {{Groups}} means
fetching the user's groups "N" times. Therefore, the Guava {{CacheLoader}}
refresh will be done "N" times, and so on.

Why did you decide to make that change instead of fixing the design of the unit
tests?
IIUC, that bug needs to be fixed in a follow-up Jira.

> Replace the legacy placement engine in CS with the new one
> --
>
> Key: YARN-10425
> URL: https://issues.apache.org/jira/browse/YARN-10425
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10425.001.patch, YARN-10425.002.patch, 
> YARN-10425.003.patch, YARN-10425.004.patch, YARN-10425.005.patch, 
> YARN-10425.006.patch, YARN-10425.007.patch
>
>
> Remove the UserGroupMapping and ApplicationName mapping classes, and use the 
> new CSMappingPlacementRule instead. Also cleanup the orphan classes which are 
> used by these classes only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10425) Replace the legacy placement engine in CS with the new one

2021-01-22 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270425#comment-17270425
 ] 

Ahmed Hussein commented on YARN-10425:
--

[~shuzirra], [~pbacsko] thanks for the contribution.
I have a question about the changes introduced by this ticket. The following
code block is from
[CSMappingPlacementRule#L128|https://github.com/apache/hadoop/commit/567600fd80896c1c9b0db1f228368d4eb2a694a2#diff-92b5797cf7739d330364d967172e65e61a859c776d9ebe526aba03ea33039033R127]
 

{code:java}
if (groups == null) {
  //We cannot use Groups#getUserToGroupsMappingService here, because when
  //tests change the HADOOP_SECURITY_GROUP_MAPPING, Groups won't refresh its
  //cached instance of groups, so we might get a Group instance which
  //ignores the HADOOP_SECURITY_GROUP_MAPPING settings.
  groups = new Groups(conf);
}
{code}

IIUC, the design of the groups cache ({{Groups.cache}}) relies on {{Groups}}
being a singleton. Otherwise, behavior becomes inconsistent, especially in
classes like {{JniBasedUnixGroupsNetgroupMapping}} and
{{ShellBasedUnixGroupsNetgroupMapping}}. Both mapping implementations have a
second caching layer for netgroups, {{NetgroupCache}}.
I have the following two concerns regarding an independent {{Groups}} instance in
{{CSMappingPlacementRule.java}}:
* It breaks the design, leading to behavior that does not match what is
expected. As I mentioned, the contents of {{NetgroupCache}} won't be defined.
* Performance considerations. Allocating "N" instances of {{Groups}} means
fetching the user's groups "N" times. Therefore, the Guava {{CacheLoader}}
refresh will be done "N" times, and so on.

Why did you decide to make that change instead of fixing the design of the unit
tests?
IIUC, that bug needs to be fixed in a follow-up Jira.

> Replace the legacy placement engine in CS with the new one
> --
>
> Key: YARN-10425
> URL: https://issues.apache.org/jira/browse/YARN-10425
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10425.001.patch, YARN-10425.002.patch, 
> YARN-10425.003.patch, YARN-10425.004.patch, YARN-10425.005.patch, 
> YARN-10425.006.patch, YARN-10425.007.patch
>
>
> Remove the UserGroupMapping and ApplicationName mapping classes, and use the 
> new CSMappingPlacementRule instead. Also cleanup the orphan classes which are 
> used by these classes only.






[jira] [Created] (YARN-10568) TestTimelineClient#testTimelineClientCleanup fails on trunk

2021-01-11 Thread Ahmed Hussein (Jira)
Ahmed Hussein created YARN-10568:


 Summary: TestTimelineClient#testTimelineClientCleanup fails on 
trunk
 Key: YARN-10568
 URL: https://issues.apache.org/jira/browse/YARN-10568
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineclient
Reporter: Ahmed Hussein


{{TestTimelineClient.testTimelineClientCleanup}} throws an NPE on trunk:

{code:bash}
java.lang.NullPointerException
    at org.apache.hadoop.yarn.client.api.impl.TestTimelineClient.testTimelineClientCleanup(TestTimelineClient.java:483)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
    at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
{code}







[jira] [Created] (YARN-10566) Elapsed time should be measured monotonicNow

2021-01-11 Thread Ahmed Hussein (Jira)
Ahmed Hussein created YARN-10566:


 Summary: Elapsed time should be measured monotonicNow
 Key: YARN-10566
 URL: https://issues.apache.org/jira/browse/YARN-10566
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ahmed Hussein
Assignee: Ahmed Hussein


I noticed widespread incorrect usage of {{System.currentTimeMillis()}} to
measure elapsed time throughout the YARN code.

For example:

{code:java}
// Some comments here
long start = System.currentTimeMillis();
while (System.currentTimeMillis() - start < timeout) {
  // Do something
}
{code}

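Elapsed time should be measured using {{monotonicNow()}}, because the wall-clock
value returned by {{System.currentTimeMillis()}} can jump forward or backward
when the system clock is adjusted.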
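
A minimal sketch of the same timeout loop on a monotonic source. It assumes {{System.nanoTime()}} as a stand-in for {{monotonicNow()}} so that the example is self-contained; the class name, {{monotonicNowMs()}}, and {{waitFor()}} are hypothetical helpers, not existing YARN code.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class MonotonicTimeoutSketch {
  // Monotonic milliseconds; only differences between two readings are meaningful.
  static long monotonicNowMs() {
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime());
  }

  // Polls the condition until it holds or timeoutMs of monotonic time has elapsed.
  static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
    long start = monotonicNowMs();
    while (monotonicNowMs() - start < timeoutMs) {
      if (condition.getAsBoolean()) {
        return true;
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        // Preserve the interrupt status and stop waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    // One last check after the deadline has passed.
    return condition.getAsBoolean();
  }

  public static void main(String[] args) {
    System.out.println(waitFor(() -> true, 1000, 10));
    System.out.println(waitFor(() -> false, 50, 10));
  }
}
```

Because the loop compares two readings of the same monotonic clock, an adjustment of the system clock during the wait neither ends the loop early nor prolongs it.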






[jira] [Updated] (YARN-10553) Refactor TestDistributedShell

2021-01-07 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-10553:
-
Description: 
TestDistributedShell has grown very large over time. It has 29 tests.
 This runs the risk of exceeding the 30-minute limit for a single unit test class.
 * The implementation has lots of code redundancy.
 * The Jira splits TestDistributedShell into three different unit tests, one for each 
TimeLineVersion: V1.0, 1.5, and 2.0
 * Fixes the broken test {{testDSShellWithEnforceExecutionType}}

  was:
TestDistributedShell has grown very large over time. It has 29 tests.
This runs the risk of exceeding the 30-minute limit for a single unit test 
class.

* The implementation has lots of code redundancy.
* It is inefficient in setup and teardown. A large percentage of execution time 
is spent starting the cluster and stopping the services.



> Refactor TestDistributedShell
> -
>
> Key: YARN-10553
> URL: https://issues.apache.org/jira/browse/YARN-10553
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell, test
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available, refactoring, test
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> TestDistributedShell has grown very large over time. It has 29 tests.
>  This runs the risk of exceeding the 30-minute limit for a single unit test 
> class.
>  * The implementation has lots of code redundancy.
>  * The Jira splits TestDistributedShell into three different unit tests, one for 
> each TimeLineVersion: V1.0, 1.5, and 2.0
>  * Fixes the broken test {{testDSShellWithEnforceExecutionType}}






[jira] [Comment Edited] (YARN-10040) DistributedShell test failure on X86 and ARM

2021-01-06 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260098#comment-17260098
 ] 

Ahmed Hussein edited comment on YARN-10040 at 1/6/21, 10:59 PM:


Thanks [~iwasakims] for fixing {{testDSShellWithOpportunisticContainers}}!
I found the fix for {{testDSShellWithEnforceExecutionType}}. It is part of 
[PR-2581|https://github.com/apache/hadoop/pull/2581].

See the description of the bug in the unit test in my 
[comment-pr-2581|https://github.com/apache/hadoop/pull/2581#issuecomment-755765315]


was (Author: ahussein):
Thanks [~iwasakims] for fixing {{testDSShellWithOpportunisticContainers}}!
I found the fix for {{testDSShellWithEnforceExecutionType}}. It is part of 
[PR-2581|https://github.com/apache/hadoop/pull/2581].

See the description of the bug in the unit test in my 
[comment-pr-2581|https://github.com/apache/hadoop/pull/2581#issuecomment-755765315]

> DistributedShell test failure on X86 and ARM
> 
>
> Key: YARN-10040
> URL: https://issues.apache.org/jira/browse/YARN-10040
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
> Environment: X86/ARM
> OS: ubuntu1804
> Java 8
>Reporter: zhao bo
>Assignee: Abhishek Modi
>Priority: Major
> Attachments: YARN-10040.001.patch
>
>
> * 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithOpportunisticContainers
>  * 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithEnforceExecutionType
> Please see the Apache Jenkins Test result:
> [https://builds.apache.org/job/hadoop-multibranch/job/PR-1767/1/testReport/]
>  
> These 2 tests fail on both X86 and ARM platforms.






[jira] [Commented] (YARN-10040) DistributedShell test failure on X86 and ARM

2021-01-06 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260098#comment-17260098
 ] 

Ahmed Hussein commented on YARN-10040:
--

Thanks [~iwasakims] for fixing {{testDSShellWithOpportunisticContainers}}!
I found the fix for {{testDSShellWithEnforceExecutionType}}. It is part of 
[PR-2581|https://github.com/apache/hadoop/pull/2581].

See the description of the bug in the unit test in my 
[comment-pr-2581|https://github.com/apache/hadoop/pull/2581#issuecomment-755765315]

> DistributedShell test failure on X86 and ARM
> 
>
> Key: YARN-10040
> URL: https://issues.apache.org/jira/browse/YARN-10040
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
> Environment: X86/ARM
> OS: ubuntu1804
> Java 8
>Reporter: zhao bo
>Assignee: Abhishek Modi
>Priority: Major
> Attachments: YARN-10040.001.patch
>
>
> * 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithOpportunisticContainers
>  * 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithEnforceExecutionType
> Please see the Apache Jenkins Test result:
> [https://builds.apache.org/job/hadoop-multibranch/job/PR-1767/1/testReport/]
>  
> These 2 tests fail on both X86 and ARM platforms.






[jira] [Updated] (YARN-10556) Web-app server does not work for Timeline V2

2020-12-30 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-10556:
-
Summary: Web-app server does not work for Timeline V2  (was: Web-app server 
does not work for V2 timeline)

> Web-app server does not work for Timeline V2
> 
>
> Key: YARN-10556
> URL: https://issues.apache.org/jira/browse/YARN-10556
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Ahmed Hussein
>Priority: Major
>
> {{TestDistributedShell}} for timeline version 2.0 shows the following 
> exception in the log files.
> YARN-3087 previously added a fix for the same issue. 
> We need to investigate whether this is a testing issue or the error 
> has resurfaced. 
> {code:bash}
> org.apache.hadoop.yarn.webapp.WebAppException: 
> /v2/timeline/clusters/yarn_cluster/apps/application_1609346161655_0001: 
> controller for v2 not found
>   at org.apache.hadoop.yarn.webapp.Router.resolveDefault(Router.java:247)
>   at org.apache.hadoop.yarn.webapp.Router.resolve(Router.java:155)
>   at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:152)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at 
> com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
>   at 
> com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
>   at 
> com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
>   at 
> com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
>   at 
> com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
>   at 
> com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
>   at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
>   at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
>   at 
> org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
>   at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:304)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
>   at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
>   at 
> org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:110)
>   at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1702)
>   at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
>   at 
> 

[jira] [Commented] (YARN-10556) Web-app server does not work for V2 timeline

2020-12-30 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17256623#comment-17256623
 ] 

Ahmed Hussein commented on YARN-10556:
--

[~gtcarrera9], [~sjlee0], [~sjlee], [~junping_du]

You are familiar with this error since you contributed to YARN-3087. Could 
you please take a quick look at the above errors?

> Web-app server does not work for V2 timeline
> 
>
> Key: YARN-10556
> URL: https://issues.apache.org/jira/browse/YARN-10556
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Ahmed Hussein
>Priority: Major
>
> {{TestDistributedShell}} for timeline version 2.0 shows the following 
> exception in the log files.
> YARN-3087 previously added a fix for the same issue. 
> We need to investigate whether this is a testing issue or the error 
> has resurfaced. 
> {code:bash}
> org.apache.hadoop.yarn.webapp.WebAppException: 
> /v2/timeline/clusters/yarn_cluster/apps/application_1609346161655_0001: 
> controller for v2 not found
>   at org.apache.hadoop.yarn.webapp.Router.resolveDefault(Router.java:247)
>   at org.apache.hadoop.yarn.webapp.Router.resolve(Router.java:155)
>   at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:152)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at 
> com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
>   at 
> com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
>   at 
> com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
>   at 
> com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
>   at 
> com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
>   at 
> com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
>   at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
>   at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
>   at 
> org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
>   at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:304)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
>   at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
>   at 
> org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:110)
>   at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1702)
>   at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> 

[jira] [Created] (YARN-10556) Web-app server does not work for V2 timeline

2020-12-30 Thread Ahmed Hussein (Jira)
Ahmed Hussein created YARN-10556:


 Summary: Web-app server does not work for V2 timeline
 Key: YARN-10556
 URL: https://issues.apache.org/jira/browse/YARN-10556
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Reporter: Ahmed Hussein


{{TestDistributedShell}} for timeline version 2.0 shows the following 
exception in the log files.
YARN-3087 previously added a fix for the same issue. We need to 
investigate whether this is a testing issue or the error has 
resurfaced. 


{code:bash}
org.apache.hadoop.yarn.webapp.WebAppException: 
/v2/timeline/clusters/yarn_cluster/apps/application_1609346161655_0001: 
controller for v2 not found
at org.apache.hadoop.yarn.webapp.Router.resolveDefault(Router.java:247)
at org.apache.hadoop.yarn.webapp.Router.resolve(Router.java:155)
at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:152)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at 
com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
at 
com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
at 
com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
at 
com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
at 
com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
at 
com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
at 
org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at 
org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
at 
org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
at 
org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at 
org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
at 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:304)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
at 
org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at 
org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
at 
org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:110)
at 
org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at 
org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
at 
org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1702)
at 
org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at 
org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
at 
org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at 
org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
at 

[jira] [Assigned] (YARN-10553) Refactor TestDistributedShell

2020-12-29 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein reassigned YARN-10553:


Assignee: Ahmed Hussein

> Refactor TestDistributedShell
> -
>
> Key: YARN-10553
> URL: https://issues.apache.org/jira/browse/YARN-10553
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell, test
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: refactoring, test
>
> TestDistributedShell has grown very large over time. It has 29 tests.
> This runs the risk of exceeding the 30-minute limit for a single unit test 
> class.
> * The implementation has lots of code redundancy.
> * It is inefficient in setup and teardown. A large percentage of 
> execution time is spent starting the cluster and stopping the services.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10553) Refactor TestDistributedShell

2020-12-28 Thread Ahmed Hussein (Jira)
Ahmed Hussein created YARN-10553:


 Summary: Refactor TestDistributedShell
 Key: YARN-10553
 URL: https://issues.apache.org/jira/browse/YARN-10553
 Project: Hadoop YARN
  Issue Type: Bug
  Components: distributed-shell, test
Reporter: Ahmed Hussein


TestDistributedShell has grown very large over time. It has 29 tests.
This runs the risk of exceeding the 30-minute limit for a single unit test 
class.

* The implementation has a lot of code redundancy.
* Setup and teardown are inefficient. A large percentage of the execution time 
is spent starting the cluster and stopping the services.







[jira] [Updated] (YARN-10040) DistributedShell test failure on X86 and ARM

2020-12-22 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-10040:
-
Priority: Major  (was: Blocker)

> DistributedShell test failure on X86 and ARM
> 
>
> Key: YARN-10040
> URL: https://issues.apache.org/jira/browse/YARN-10040
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
> Environment: X86/ARM
> OS: ubuntu1804
> Java 8
>Reporter: zhao bo
>Assignee: Abhishek Modi
>Priority: Major
> Attachments: YARN-10040.001.patch
>
>
> * 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithOpportunisticContainers
>  * 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithEnforceExecutionType
> Please see the Apache Jenkins Test result:
> [https://builds.apache.org/job/hadoop-multibranch/job/PR-1767/1/testReport/]
>  
> These 2 tests fail on both X86 and ARM platforms.






[jira] [Comment Edited] (YARN-10040) DistributedShell test failure on X86 and ARM

2020-12-22 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17253737#comment-17253737
 ] 

Ahmed Hussein edited comment on YARN-10040 at 12/22/20, 8:48 PM:
-

[~abmodi] can you suggest anyone familiar with the changes done in YARN-9697?


was (Author: ahussein):
I changed the status of this Jira to blocker.

[~abmodi] can you suggest anyone familiar with the changes done in YARN-9697?

> DistributedShell test failure on X86 and ARM
> 
>
> Key: YARN-10040
> URL: https://issues.apache.org/jira/browse/YARN-10040
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
> Environment: X86/ARM
> OS: ubuntu1804
> Java 8
>Reporter: zhao bo
>Assignee: Abhishek Modi
>Priority: Major
> Attachments: YARN-10040.001.patch
>
>
> * 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithOpportunisticContainers
>  * 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithEnforceExecutionType
> Please see the Apache Jenkins Test result:
> [https://builds.apache.org/job/hadoop-multibranch/job/PR-1767/1/testReport/]
>  
> These 2 tests fail on both X86 and ARM platforms.






[jira] [Commented] (YARN-10040) DistributedShell test failure on X86 and ARM

2020-12-22 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17253737#comment-17253737
 ] 

Ahmed Hussein commented on YARN-10040:
--

I changed the status of this Jira to blocker.

[~abmodi] can you suggest anyone familiar with the changes done in YARN-9697?

> DistributedShell test failure on X86 and ARM
> 
>
> Key: YARN-10040
> URL: https://issues.apache.org/jira/browse/YARN-10040
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
> Environment: X86/ARM
> OS: ubuntu1804
> Java 8
>Reporter: zhao bo
>Assignee: Abhishek Modi
>Priority: Blocker
> Attachments: YARN-10040.001.patch
>
>
> * 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithOpportunisticContainers
>  * 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithEnforceExecutionType
> Please see the Apache Jenkins Test result:
> [https://builds.apache.org/job/hadoop-multibranch/job/PR-1767/1/testReport/]
>  
> These 2 tests fail on both X86 and ARM platforms.






[jira] [Updated] (YARN-10040) DistributedShell test failure on X86 and ARM

2020-12-22 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-10040:
-
Priority: Blocker  (was: Major)

> DistributedShell test failure on X86 and ARM
> 
>
> Key: YARN-10040
> URL: https://issues.apache.org/jira/browse/YARN-10040
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
> Environment: X86/ARM
> OS: ubuntu1804
> Java 8
>Reporter: zhao bo
>Assignee: Abhishek Modi
>Priority: Blocker
> Attachments: YARN-10040.001.patch
>
>
> * 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithOpportunisticContainers
>  * 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithEnforceExecutionType
> Please see the Apache Jenkins Test result:
> [https://builds.apache.org/job/hadoop-multibranch/job/PR-1767/1/testReport/]
>  
> These 2 tests fail on both X86 and ARM platforms.






[jira] [Commented] (YARN-10334) TestDistributedShell leaks resources on timeout/failure

2020-12-17 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251298#comment-17251298
 ] 

Ahmed Hussein commented on YARN-10334:
--

These are the steps that will fix the problem:
* YARN-10536 is going to make the thread responsive in handling exceptions.
* Pass a {{timeout}} argument to the {{DistributedShell.Client}}. This timeout 
has to be smaller than the {{TestDistributedShell.timeout}} rule.
* Optional: Client and YarnClient have no interfaces to shut down/close. Adding 
such methods, accessible from the unit tests, would be a good addition in order 
to clean up the code.
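
The second and third steps can be illustrated with a minimal sketch. It assumes a hypothetical wrapper class, {{ClosableShellClient}}, which is not part of Hadoop and does not use the real {{Client}} or {{YarnClient}} types. The margin value is also an assumption. The sketch only shows deriving a client timeout that is strictly smaller than the test timeout, and guaranteeing cleanup through {{AutoCloseable}} even when the test body throws.

```java
/** Hypothetical wrapper; it does not use the real Client or YarnClient types. */
final class ClosableShellClient implements AutoCloseable {
  private final long timeoutMs;
  private boolean closed;

  /** Derives the client timeout by subtracting a safety margin from the test timeout. */
  ClosableShellClient(long testTimeoutMs, long marginMs) {
    if (marginMs <= 0 || marginMs >= testTimeoutMs) {
      throw new IllegalArgumentException("margin must be in (0, testTimeoutMs)");
    }
    this.timeoutMs = testTimeoutMs - marginMs;
  }

  long timeoutMs() {
    return timeoutMs;
  }

  boolean isClosed() {
    return closed;
  }

  /** A real implementation would stop the application and release the client here. */
  @Override
  public void close() {
    closed = true;
  }

  public static void main(String[] args) {
    ClosableShellClient client = new ClosableShellClient(90_000L, 10_000L);
    // try-with-resources closes the client before the catch block runs.
    try (ClosableShellClient c = client) {
      throw new IllegalStateException("simulated test failure");
    } catch (IllegalStateException e) {
      System.out.println("closed=" + client.isClosed()
          + ", timeoutMs=" + client.timeoutMs());
    }
  }
}
```

With this shape, the client gives up on its own before the test rule fires, so the cleanup path runs inside the test rather than being cut off by it.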

> TestDistributedShell leaks resources on timeout/failure
> ---
>
> Key: YARN-10334
> URL: https://issues.apache.org/jira/browse/YARN-10334
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell, test, yarn
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: newbie, test
>
> {{TestDistributedShell}} times out on trunk. I found that the application 
> and containers stay running in the background long after the unit test 
> has failed.
> This causes other test cases to fail and produces several false-positive 
> failures as a result of:
> * Ports stay busy, so other test cases fail to launch.
> * Unit tests fail because of memory restrictions.
> Although the unit test is already broken on trunk, we do not want its 
> failures to affect other unit tests.
> {{TestDistributedShell}} needs to be revisited to make sure that all 
> {{YarnClients}} and {{YarnApplications}} are closed properly at the end of 
> each unit test (including exceptions and timeouts).
> Steps to reproduce:
> {code:bash}
> mvn test -Dtest=TestDistributedShell#testDSShellWithOpportunisticContainers
> ## this will timeout as
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 90.234 s <<< FAILURE! - in 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
> [ERROR] 
> testDSShellWithOpportunisticContainers(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell)
>   Time elapsed: 90.018 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 9 
> milliseconds
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.Client.monitorApplication(Client.java:1117)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:1089)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithOpportunisticContainers(TestDistributedShell.java:1438)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.lang.Thread.run(Thread.java:748)
> [INFO] 
> [INFO] Results:
> [INFO] 
> [ERROR] Errors: 
> [ERROR]   TestDistributedShell.testDSShellWithOpportunisticContainers:1438 » 
> TestTimedOut
> [INFO] 
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0
> {code}
> Using the {{ps}} command, you can see that the YARN processes are still 
> running in the background:
> {code:bash}
> /bin/bash -c $JRE_HOME/bin/java -Xmx512m 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
> --container_type OPPORTUNISTIC --container_memory 128 --container_vcores 1 
> --num_containers 2 --priority 0 --appname DistributedShell --homedir 
> file:/Users/ahussein 
> 

[jira] [Commented] (YARN-10499) TestRouterWebServicesREST fails

2020-12-17 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251292#comment-17251292
 ] 

Ahmed Hussein commented on YARN-10499:
--

[~aajisaka] .. You are the man :)

It feels great to see the failing list down to:

 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/358/#showFailuresLink
{code:bash}
Test Result (6 failures / -202)
org.apache.hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks.testSetRepIncWithUnderReplicatedBlocks
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testReadLockCanBeDisabledByConfig
org.apache.hadoop.yarn.sls.appmaster.TestAMSimulator.testAMSimulatorWithNodeLabels[1]
org.apache.hadoop.tools.dynamometer.TestDynamometerInfra.org.apache.hadoop.tools.dynamometer.TestDynamometerInfra
org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithOpportunisticContainers
org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithEnforceExecutionType
{code}


 

> TestRouterWebServicesREST fails
> ---
>
> Key: YARN-10499
> URL: https://issues.apache.org/jira/browse/YARN-10499
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-router.txt
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> [https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2488/1/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn.txt]
> {noformat}
> [ERROR] Failures: 
> [ERROR]   
> TestRouterWebServicesREST.testAppAttemptXML:720->performGetCalls:274 
> expected:<200> but was:<204>
> [ERROR]   
> TestRouterWebServicesREST.testAppPriorityXML:796->performGetCalls:274 
> expected:<200> but was:<204>
> [ERROR]   TestRouterWebServicesREST.testAppQueueXML:846->performGetCalls:274 
> expected:<200> but was:<204>
> [ERROR]   TestRouterWebServicesREST.testAppStateXML:744->performGetCalls:274 
> expected:<200> but was:<204>
> [ERROR]   
> TestRouterWebServicesREST.testAppTimeoutXML:920->performGetCalls:274 
> expected:<200> but was:<204>
> [ERROR]   
> TestRouterWebServicesREST.testAppTimeoutsXML:896->performGetCalls:274 
> expected:<200> but was:<204>
> [ERROR]   TestRouterWebServicesREST.testAppXML:696->performGetCalls:274 
> expected:<200> but was:<204>
> [ERROR]   TestRouterWebServicesREST.testUpdateAppPriorityXML:832 
> expected:<200> but was:<500>
> [ERROR]   TestRouterWebServicesREST.testUpdateAppQueueXML:882 expected:<200> 
> but was:<500>
> [ERROR]   TestRouterWebServicesREST.testUpdateAppStateXML:782 expected:<202> 
> but was:<500>
> [ERROR] Errors: 
> [ERROR]   
> TestRouterWebServicesREST.testGetAppAttemptXML:1292->getAppAttempt:1464 » 
> ClientHandler
> [ERROR]   
> TestRouterWebServicesREST.testGetAppsMultiThread:1337->testGetContainersXML:1317->getAppAttempt:1464
>  » ClientHandler
> [ERROR]   
> TestRouterWebServicesREST.testGetContainersXML:1317->getAppAttempt:1464 » 
> ClientHandler {noformat}






[jira] [Commented] (YARN-10536) Client in distributedShell swallows interrupt exceptions

2020-12-17 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251173#comment-17251173
 ] 

Ahmed Hussein commented on YARN-10536:
--

[~ayushsaxena], [~inigoiri], [~epayne]
Can you please take a look at that small change?
After it gets merged, I will work on YARN-10536 to reduce the overhead of 
running those tests.

> Client in distributedShell swallows interrupt exceptions
> 
>
> Key: YARN-10536
> URL: https://issues.apache.org/jira/browse/YARN-10536
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client, distributed-shell
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In {{applications.distributedshell.Client}}, the method 
> {{monitorApplication}} loops waiting for one of the following conditions:
> * Application fails: reaches {{YarnApplicationState.KILLED}} or 
> {{YarnApplicationState.FAILED}}
> * Application succeeds: {{FinalApplicationStatus.SUCCEEDED}} or 
> {{YarnApplicationState.FINISHED}}
> * The time spent waiting is longer than {{clientTimeout}} (if it exists in 
> the parameters).
> When the Client thread is interrupted, it ignores the exception:
> {code:java}
>   // Check app status every 1 second.
>   try {
>     Thread.sleep(1000);
>   } catch (InterruptedException e) {
>     LOG.debug("Thread sleep in monitoring loop interrupted");
>   }
> {code}






[jira] [Commented] (YARN-10536) Client in distributedShell swallows interrupt exceptions

2020-12-16 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250778#comment-17250778
 ] 

Ahmed Hussein commented on YARN-10536:
--

The current implementation checks the timeout with reference to 
{{Client.clientStartTime}}. The latter is the timestamp of the object creation 
as shown in that [line of 
code|https://github.com/apache/hadoop/blob/df7f1e5199eed917ff40618708e7641238684d24/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java#L212].
The timeout should be measured from when the client gets started (by calling 
{{run()}}), as in this [line of 
code|https://github.com/apache/hadoop/blob/df7f1e5199eed917ff40618708e7641238684d24/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java#L671].
I do not think there is any point in starting the countdown at object creation.
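
To make the difference concrete, here is a minimal sketch of a client whose countdown starts in {{run()}} rather than in the constructor. {{TimedClient}}, its fields, and the injected clock are hypothetical names used only for illustration. They do not mirror the actual distributedshell {{Client}} implementation.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch; names do not mirror the real distributedshell Client. */
final class TimedClient {
  private final long timeoutMs;
  private final LongSupplier clockMs; // injected clock, so the sketch is testable
  private long runStartMs = -1L;      // unset until run() is called

  TimedClient(long timeoutMs, LongSupplier clockMs) {
    // Note: no timestamp is taken at construction time.
    this.timeoutMs = timeoutMs;
    this.clockMs = clockMs;
  }

  /** Starts the countdown when the client actually starts running. */
  void run() {
    runStartMs = clockMs.getAsLong();
  }

  /** True only if run() was called and more than timeoutMs has elapsed since then. */
  boolean timedOut() {
    return runStartMs >= 0 && clockMs.getAsLong() - runStartMs > timeoutMs;
  }

  public static void main(String[] args) {
    long[] now = {0L};
    TimedClient client = new TimedClient(1000L, () -> now[0]);
    now[0] = 5000L; // time passes between construction and run()
    client.run();
    now[0] = 5500L; // 500 ms after run(): still within the timeout
    System.out.println("timedOut=" + client.timedOut());
    now[0] = 6500L; // 1500 ms after run(): past the timeout
    System.out.println("timedOut=" + client.timedOut());
  }
}
```

Any delay between creating the object and calling {{run()}} no longer eats into the timeout budget.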

> Client in distributedShell swallows interrupt exceptions
> 
>
> Key: YARN-10536
> URL: https://issues.apache.org/jira/browse/YARN-10536
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client, distributed-shell
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In {{applications.distributedshell.Client}}, the method 
> {{monitorApplication}} loops waiting for one of the following conditions:
> * Application fails: reaches {{YarnApplicationState.KILLED}} or 
> {{YarnApplicationState.FAILED}}
> * Application succeeds: {{FinalApplicationStatus.SUCCEEDED}} or 
> {{YarnApplicationState.FINISHED}}
> * The time spent waiting is longer than {{clientTimeout}} (if it exists in 
> the parameters).
> When the Client thread is interrupted, it ignores the exception:
> {code:java}
>   // Check app status every 1 second.
>   try {
>     Thread.sleep(1000);
>   } catch (InterruptedException e) {
>     LOG.debug("Thread sleep in monitoring loop interrupted");
>   }
> {code}






[jira] [Created] (YARN-10536) Client in distributedShell swallows interrupt exceptions

2020-12-16 Thread Ahmed Hussein (Jira)
Ahmed Hussein created YARN-10536:


 Summary: Client in distributedShell swallows interrupt exceptions
 Key: YARN-10536
 URL: https://issues.apache.org/jira/browse/YARN-10536
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client, distributed-shell
Reporter: Ahmed Hussein
Assignee: Ahmed Hussein


In {{applications.distributedshell.Client}}, the method {{monitorApplication}} 
loops waiting for one of the following conditions:

* Application fails: reaches {{YarnApplicationState.KILLED}} or 
{{YarnApplicationState.FAILED}}
* Application succeeds: {{FinalApplicationStatus.SUCCEEDED}} or 
{{YarnApplicationState.FINISHED}}
* The time spent waiting is longer than {{clientTimeout}} (if it exists in the 
parameters).

When the Client thread is interrupted, it ignores the exception:

{code:java}
  // Check app status every 1 second.
  try {
    Thread.sleep(1000);
  } catch (InterruptedException e) {
    LOG.debug("Thread sleep in monitoring loop interrupted");
  }
{code}
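
For comparison, one common way to handle the interrupt is sketched below. This is only an illustration under stated assumptions, not the patch for this issue: {{InterruptAwareMonitor}}, its {{monitor}} method, and the {{isDone}} callback are hypothetical stand-ins for the real monitoring loop. The sketch restores the thread's interrupt flag and leaves the loop instead of continuing to poll.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for the monitoring loop; not the actual Client code. */
final class InterruptAwareMonitor {

  /**
   * Polls isDone until it returns true or the calling thread is interrupted.
   * Returns true if the work finished, false if polling stopped on an interrupt.
   */
  static boolean monitor(BooleanSupplier isDone, long pollIntervalMs) {
    while (!isDone.getAsBoolean()) {
      try {
        Thread.sleep(pollIntervalMs);
      } catch (InterruptedException e) {
        // Restore the interrupt flag so callers can observe it, then stop polling.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) throws InterruptedException {
    Thread worker = new Thread(() -> {
      // The condition never becomes true, so only an interrupt ends the loop.
      boolean finished = monitor(() -> false, 1000L);
      System.out.println("finished=" + finished
          + ", interrupted=" + Thread.currentThread().isInterrupted());
    });
    worker.start();
    worker.interrupt();
    worker.join();
  }
}
```

Because the flag is restored, a test framework that interrupts the thread on timeout can still see that the interrupt was delivered.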







[jira] [Comment Edited] (YARN-10040) DistributedShell test failure on X86 and ARM

2020-12-10 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17247564#comment-17247564
 ] 

Ahmed Hussein edited comment on YARN-10040 at 12/11/20, 5:28 AM:
-

{quote}Abhishek Modi any pointers about this? Is the code only broken or just 
the test. If the functionality itself has some issue we should consider 
reverting YARN-9697, else if this is only a test issue, we should wrap this up, 
if there isn't a fix available we can disable this test for time being. Let me 
know what is the actual situation. I can try help in whichever way 
possible.{quote}

[~abmodi] Would you mind please taking a look at the failures?




was (Author: ahussein):
On iOS, {{TestDistributedShell}} does not run. I thought I would dump the 
error here because an NPE could be a hint as to what is broken in the 
implementation.


{code:bash}
2020-12-10 17:29:22,129 INFO  [IPC Server listener on 8048] ipc.Server 
(Server.java:run(1344)) - IPC Server listener on 8048: starting
2020-12-10 17:29:22,131 INFO  [Listener at localhost/8048] 
collectormanager.NMCollectorService (NMCollectorService.java:serviceStart(101)) 
- NMCollectorService started at localhost/127.0.0.1:8048
2020-12-10 17:29:22,131 INFO  [Listener at localhost/8048] 
nodemanager.NodeStatusUpdaterImpl 
(NodeStatusUpdaterImpl.java:serviceStart(267)) - Node ID assigned is : 
localhost:54943
2020-12-10 17:29:22,207 INFO  [Listener at localhost/8048] 
resourcemanager.ResourceTrackerService 
(ResourceTrackerService.java:registerNodeManager(617)) - NodeManager from node 
localhost(cmPort: 54943 httpPort: 54946) registered with capability: 
, assigned nodeId localhost:54943
2020-12-10 17:29:22,210 INFO  [Listener at localhost/8048] 
security.NMContainerTokenSecretManager 
(NMContainerTokenSecretManager.java:setMasterKey(143)) - Rolling master-key for 
container-tokens, got key with id -210390460
2020-12-10 17:29:22,210 INFO  [Listener at localhost/8048] 
security.NMTokenSecretManagerInNM 
(NMTokenSecretManagerInNM.java:setMasterKey(143)) - Rolling master-key for 
container-tokens, got key with id -1432443197
2020-12-10 17:29:22,210 INFO  [Listener at localhost/8048] 
nodemanager.NodeStatusUpdaterImpl 
(NodeStatusUpdaterImpl.java:registerWithRM(486)) - Registered with 
ResourceManager as localhost:54943 with total resource of 
2020-12-10 17:29:22,212 INFO  [Listener at localhost/8048] 
delegation.AbstractDelegationTokenSecretManager 
(AbstractDelegationTokenSecretManager.java:updateCurrentKey(367)) - Updating 
the current master key for generating delegation tokens
2020-12-10 17:29:22,212 INFO  [Thread[Thread-282,5,FailOnTimeoutGroup]] 
delegation.AbstractDelegationTokenSecretManager 
(AbstractDelegationTokenSecretManager.java:run(701)) - Starting expired 
delegation token remover thread, tokenRemoverScanInterval=60 min(s)
2020-12-10 17:29:22,212 INFO  [Thread[Thread-282,5,FailOnTimeoutGroup]] 
delegation.AbstractDelegationTokenSecretManager 
(AbstractDelegationTokenSecretManager.java:updateCurrentKey(367)) - Updating 
the current master key for generating delegation tokens
2020-12-10 17:29:22,212 INFO  [RM Event dispatcher] rmnode.RMNodeImpl 
(RMNodeImpl.java:handle(774)) - localhost:54943 Node Transitioned from NEW to 
UNHEALTHY
2020-12-10 17:29:22,214 INFO  
[org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService:Event
 Processor] distributed.NodeQueueLoadMonitor 
(NodeQueueLoadMonitor.java:removeNode(202)) - Node delete event for: localhost
2020-12-10 17:29:22,215 ERROR [SchedulerEventDispatcher:Event Processor] 
capacity.CapacityScheduler (CapacityScheduler.java:removeNode(2127)) - 
Attempting to remove non-existent node localhost:54943
2020-12-10 17:29:22,215 ERROR 
[org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService:Event
 Processor] event.EventDispatcher (MarkerIgnoringBase.java:error(159)) - Error 
in handling event type NODE_REMOVED to the Event Dispatcher
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.distributed.NodeQueueLoadMonitor.removeFromNodeIdsByRack(NodeQueueLoadMonitor.java:405)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.distributed.NodeQueueLoadMonitor.removeNode(NodeQueueLoadMonitor.java:204)
at 
org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService.handle(OpportunisticContainerAllocatorAMService.java:399)
at 
org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService.handle(OpportunisticContainerAllocatorAMService.java:94)
at 
org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:71)
at java.lang.Thread.run(Thread.java:748)
2020-12-10 17:29:22,216 INFO  
[org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService:Event
 Processor] 

[jira] [Assigned] (YARN-10334) TestDistributedShell leaks resources on timeout/failure

2020-12-10 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein reassigned YARN-10334:


Assignee: Ahmed Hussein

> TestDistributedShell leaks resources on timeout/failure
> ---
>
> Key: YARN-10334
> URL: https://issues.apache.org/jira/browse/YARN-10334
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell, test, yarn
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: newbie, test
>
> {{TestDistributedShell}} times out on trunk. I found that the application 
> and containers stay running in the background long after the unit test 
> has failed.
> This causes other test cases to fail and produces several false-positive 
> failures as a result of:
> * Ports stay busy, so other test cases fail to launch.
> * Unit tests fail because of memory restrictions.
> Although the unit test is already broken on trunk, we do not want its 
> failures to affect other unit tests.
> {{TestDistributedShell}} needs to be revisited to make sure that all 
> {{YarnClients}} and {{YarnApplications}} are closed properly at the end of 
> each unit test (including exceptions and timeouts).
> Steps to reproduce:
> {code:bash}
> mvn test -Dtest=TestDistributedShell#testDSShellWithOpportunisticContainers
> ## this will timeout as
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 90.234 s <<< FAILURE! - in 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
> [ERROR] 
> testDSShellWithOpportunisticContainers(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell)
>   Time elapsed: 90.018 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 9 
> milliseconds
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.Client.monitorApplication(Client.java:1117)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:1089)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithOpportunisticContainers(TestDistributedShell.java:1438)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.lang.Thread.run(Thread.java:748)
> [INFO] 
> [INFO] Results:
> [INFO] 
> [ERROR] Errors: 
> [ERROR]   TestDistributedShell.testDSShellWithOpportunisticContainers:1438 » 
> TestTimedOut
> [INFO] 
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0
> {code}
> Using the {{ps}} command, you can see that the YARN processes are still 
> running in the background:
> {code:bash}
> /bin/bash -c $JRE_HOME/bin/java -Xmx512m 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
> --container_type OPPORTUNISTIC --container_memory 128 --container_vcores 1 
> --num_containers 2 --priority 0 --appname DistributedShell --homedir 
> file:/Users/ahussein 
> 1>$WORK_DIR8/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/target/TestDistributedShell/TestDistributedShell-logDir-nm-0_0/application_1593554710896_0001/container_1593554710896_0001_01_01/AppMaster.stdout
>  
> 2>$WORK_DIR8/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/target/TestDistributedShell/TestDistributedShell-logDir-nm-0_0/application_1593554710896_0001/container_1593554710896_0001_01_01/AppMaster.stderr
> $JRE_HOME/bin/java -Xmx512m 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
> --container_type OPPORTUNISTIC --container_memory 128 

[jira] [Commented] (YARN-10040) DistributedShell test failure on X86 and ARM

2020-12-10 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17247564#comment-17247564
 ] 

Ahmed Hussein commented on YARN-10040:
--

On iOS, {{TestDistributedShell}} does not run. I thought I would dump the 
error here because an NPE could be a hint as to what is broken in the 
implementation.


{code:bash}
2020-12-10 17:29:22,129 INFO  [IPC Server listener on 8048] ipc.Server 
(Server.java:run(1344)) - IPC Server listener on 8048: starting
2020-12-10 17:29:22,131 INFO  [Listener at localhost/8048] 
collectormanager.NMCollectorService (NMCollectorService.java:serviceStart(101)) 
- NMCollectorService started at localhost/127.0.0.1:8048
2020-12-10 17:29:22,131 INFO  [Listener at localhost/8048] 
nodemanager.NodeStatusUpdaterImpl 
(NodeStatusUpdaterImpl.java:serviceStart(267)) - Node ID assigned is : 
localhost:54943
2020-12-10 17:29:22,207 INFO  [Listener at localhost/8048] 
resourcemanager.ResourceTrackerService 
(ResourceTrackerService.java:registerNodeManager(617)) - NodeManager from node 
localhost(cmPort: 54943 httpPort: 54946) registered with capability: 
, assigned nodeId localhost:54943
2020-12-10 17:29:22,210 INFO  [Listener at localhost/8048] 
security.NMContainerTokenSecretManager 
(NMContainerTokenSecretManager.java:setMasterKey(143)) - Rolling master-key for 
container-tokens, got key with id -210390460
2020-12-10 17:29:22,210 INFO  [Listener at localhost/8048] 
security.NMTokenSecretManagerInNM 
(NMTokenSecretManagerInNM.java:setMasterKey(143)) - Rolling master-key for 
container-tokens, got key with id -1432443197
2020-12-10 17:29:22,210 INFO  [Listener at localhost/8048] 
nodemanager.NodeStatusUpdaterImpl 
(NodeStatusUpdaterImpl.java:registerWithRM(486)) - Registered with 
ResourceManager as localhost:54943 with total resource of 
2020-12-10 17:29:22,212 INFO  [Listener at localhost/8048] 
delegation.AbstractDelegationTokenSecretManager 
(AbstractDelegationTokenSecretManager.java:updateCurrentKey(367)) - Updating 
the current master key for generating delegation tokens
2020-12-10 17:29:22,212 INFO  [Thread[Thread-282,5,FailOnTimeoutGroup]] 
delegation.AbstractDelegationTokenSecretManager 
(AbstractDelegationTokenSecretManager.java:run(701)) - Starting expired 
delegation token remover thread, tokenRemoverScanInterval=60 min(s)
2020-12-10 17:29:22,212 INFO  [Thread[Thread-282,5,FailOnTimeoutGroup]] 
delegation.AbstractDelegationTokenSecretManager 
(AbstractDelegationTokenSecretManager.java:updateCurrentKey(367)) - Updating 
the current master key for generating delegation tokens
2020-12-10 17:29:22,212 INFO  [RM Event dispatcher] rmnode.RMNodeImpl 
(RMNodeImpl.java:handle(774)) - localhost:54943 Node Transitioned from NEW to 
UNHEALTHY
2020-12-10 17:29:22,214 INFO  
[org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService:Event
 Processor] distributed.NodeQueueLoadMonitor 
(NodeQueueLoadMonitor.java:removeNode(202)) - Node delete event for: localhost
2020-12-10 17:29:22,215 ERROR [SchedulerEventDispatcher:Event Processor] 
capacity.CapacityScheduler (CapacityScheduler.java:removeNode(2127)) - 
Attempting to remove non-existent node localhost:54943
2020-12-10 17:29:22,215 ERROR 
[org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService:Event
 Processor] event.EventDispatcher (MarkerIgnoringBase.java:error(159)) - Error 
in handling event type NODE_REMOVED to the Event Dispatcher
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.distributed.NodeQueueLoadMonitor.removeFromNodeIdsByRack(NodeQueueLoadMonitor.java:405)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.distributed.NodeQueueLoadMonitor.removeNode(NodeQueueLoadMonitor.java:204)
at 
org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService.handle(OpportunisticContainerAllocatorAMService.java:399)
at 
org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService.handle(OpportunisticContainerAllocatorAMService.java:94)
at 
org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:71)
at java.lang.Thread.run(Thread.java:748)
2020-12-10 17:29:22,216 INFO  
[org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService:Event
 Processor] event.EventDispatcher (EventDispatcher.java:run(84)) - Exiting, 
bbye..
2020-12-10 17:29:22,217 INFO  [Listener at localhost/8048] ipc.CallQueueManager 
(CallQueueManager.java:(93)) - Using callQueue: class 
java.util.concurrent.LinkedBlockingQueue, queueCapacity: 1000, scheduler: class 
org.apache.hadoop.ipc.DefaultRpcScheduler, ipcBackoff: false.
2020-12-10 17:29:22,218 INFO  [Socket Reader #1 for port 0] ipc.Server 
(Server.java:run(1265)) - Starting Socket Reader #1 for port 0
2020-12-10 17:29:22,222 INFO  [Listener at localhost/54947] 

[jira] [Comment Edited] (YARN-10494) CLI tool for docker-to-squashfs conversion (pure Java)

2020-12-02 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242719#comment-17242719
 ] 

Ahmed Hussein edited comment on YARN-10494 at 12/2/20, 8:54 PM:


Thanks [~ccondit] for the update.

I suggest creating a branch and a WIP PR to make peer reviews easier.


was (Author: ahussein):
Thanks [~ccondit] for the update.

> CLI tool for docker-to-squashfs conversion (pure Java)
> --
>
> Key: YARN-10494
> URL: https://issues.apache.org/jira/browse/YARN-10494
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.3.0
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
> Attachments: YARN-10494.001.patch, 
> docker-to-squashfs-conversion-tool-design.pdf
>
>
> *YARN-9564* defines a docker-to-squashfs image conversion tool that relies on 
> python2, multiple libraries, squashfs-tools and root access in order to 
> convert Docker images to squashfs images for use with the runc container 
> runtime in YARN.
> *YARN-9943* was created to investigate alternatives, as the response to 
> merging YARN-9564 has not been very positive. This proposal outlines the 
> design for a CLI conversion tool in 100% pure Java that will work out of the 
> box.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10494) CLI tool for docker-to-squashfs conversion (pure Java)

2020-12-02 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242719#comment-17242719
 ] 

Ahmed Hussein commented on YARN-10494:
--

Thanks [~ccondit] for the update.

> CLI tool for docker-to-squashfs conversion (pure Java)
> --
>
> Key: YARN-10494
> URL: https://issues.apache.org/jira/browse/YARN-10494
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.3.0
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
> Attachments: YARN-10494.001.patch, 
> docker-to-squashfs-conversion-tool-design.pdf
>
>



[jira] [Assigned] (YARN-10468) TestNodeStatusUpdater does not handle early failure in threads

2020-11-11 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein reassigned YARN-10468:


Assignee: Ahmed Hussein

> TestNodeStatusUpdater does not handle early failure in threads
> --
>
> Key: YARN-10468
> URL: https://issues.apache.org/jira/browse/YARN-10468
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>
> While investigating HADOOP-17314, I found the following:
> * TestNodeStatusUpdater#testNMRegistration() will continue running {{while 
> (heartBeatID <= 3 && waitCount++ != 200) {}} even though the NM thread could 
> already be dead. The unit test should detect that the NM has died and 
> terminate sooner to release resources for other tests.
> * TestNodeStatusUpdater#testNMRMConnectionConf() has the same problem as 
> described above. 






[jira] [Created] (YARN-10485) TimelineConnector swallows InterruptedException

2020-11-09 Thread Ahmed Hussein (Jira)
Ahmed Hussein created YARN-10485:


 Summary: TimelineConnector swallows InterruptedException
 Key: YARN-10485
 URL: https://issues.apache.org/jira/browse/YARN-10485
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ahmed Hussein
Assignee: Ahmed Hussein


Some tests time out or take excessively long to shut down because the 
{{TimelineConnector}} catches {{InterruptedException}} and goes into a retry 
loop instead of aborting.

[~daryn] reported that this makes debugging more difficult, and he suggests that 
the exception be thrown instead.
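
A minimal sketch of the suggested behavior, assuming a generic retry helper. The class {{InterruptAwareRetry}}, the method {{callWithRetries}}, and its parameters are hypothetical names used only for illustration; this is not the actual {{TimelineConnector}} code. The idea is that an ordinary {{IOException}} is retried, while an interrupt restores the thread's interrupt flag and ends the loop right away.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; this is not the TimelineConnector code. */
final class InterruptAwareRetry {

  private InterruptAwareRetry() {
  }

  /**
   * Calls op up to maxAttempts times, sleeping sleepMillis between failures.
   * An IOException is treated as retriable; an interrupt aborts immediately.
   */
  static <T> T callWithRetries(Callable<T> op, int maxAttempts,
      long sleepMillis) throws IOException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    IOException lastFailure = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return op.call();
      } catch (InterruptedException | InterruptedIOException e) {
        // Do not retry: the caller asked this thread to stop.
        throw aborted(e, attempt);
      } catch (IOException e) {
        lastFailure = e; // retriable: fall through to the back-off sleep
      } catch (Exception e) {
        throw new IOException("Non-retriable failure", e);
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(sleepMillis);
        } catch (InterruptedException e) {
          throw aborted(e, attempt);
        }
      }
    }
    throw lastFailure;
  }

  /** Restores the interrupt flag and builds the exception that ends the loop. */
  private static InterruptedIOException aborted(Exception cause, int attempt) {
    Thread.currentThread().interrupt();
    InterruptedIOException e =
        new InterruptedIOException("Interrupted during attempt " + attempt);
    e.initCause(cause);
    return e;
  }
}
```

With this shape, a test that interrupts the calling thread sees the operation attempted once and then receives an {{InterruptedIOException}}, rather than waiting for the remaining retries to run.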






[jira] [Commented] (YARN-10483) YARN hangs, jobs cannot be submitted, and only switching the active RM or restarting recovers it

2020-11-05 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226909#comment-17226909
 ] 

Ahmed Hussein commented on YARN-10483:
--

Thanks [~weichiu] :) This is very helpful information.

> YARN hangs, jobs cannot be submitted, and only switching the active RM or restarting recovers it
> --
>
> Key: YARN-10483
> URL: https://issues.apache.org/jira/browse/YARN-10483
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler, resourcemanager, 
> RM
>Affects Versions: 3.1.1
>Reporter: jufeng li
>Priority: Blocker
> Attachments: RM_normal_state.stack, RM_unnormal_state.stack
>
>
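> YARN hangs intermittently and new jobs cannot be submitted. The jstack logs 
> show capacity scheduler threads waiting indefinitely for a lock, while the 
> RM's CPU, memory, network, and disk are all normal. The problem is almost 
> certainly in the capacity scheduler's internal locking. The RM jstack logs 
> for both the normal state and the hung state have been uploaded. I hope 
> someone can resolve this: the bug is quite serious and makes production 
> unusable. If nobody answers, I will come back and ask again later.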






[jira] [Comment Edited] (YARN-10483) YARN hangs, jobs cannot be submitted, and only switching the active RM or restarting recovers it

2020-11-05 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226840#comment-17226840
 ] 

Ahmed Hussein edited comment on YARN-10483 at 11/5/20, 5:13 PM:


[~Jufeng] Can you please change the title and description of this Jira to 
English?

 I do not think it is a good idea to have Jiras in multiple languages because it 
complicates searching for everyone.
 Thank you.


was (Author: ahussein):
[~Jufeng] Can you please change the title and description of this Jira to 
English.

 I do not think it is a good idea to have multiple languages Jiras because we 
it complicates searching for everyone.
Thank You.

> YARN hangs, jobs cannot be submitted, and only switching the active RM or restarting recovers it
> --
>
> Key: YARN-10483
> URL: https://issues.apache.org/jira/browse/YARN-10483
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler, resourcemanager, 
> RM
>Affects Versions: 3.1.1
>Reporter: jufeng li
>Priority: Blocker
> Attachments: RM_normal_state.stack, RM_unnormal_state.stack
>
>



[jira] [Resolved] (YARN-10483) YARN hangs, jobs cannot be submitted, and only switching the active RM or restarting recovers it

2020-11-05 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein resolved YARN-10483.
--
Release Note: Please create Jiras that make it easy for other developers 
to search and understand. 
  Resolution: Information Provided

> YARN hangs, jobs cannot be submitted, and only switching the active RM or restarting recovers it
> --
>
> Key: YARN-10483
> URL: https://issues.apache.org/jira/browse/YARN-10483
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler, resourcemanager, 
> RM
>Affects Versions: 3.1.1
>Reporter: jufeng li
>Priority: Blocker
> Attachments: RM_normal_state.stack, RM_unnormal_state.stack
>
>



[jira] [Commented] (YARN-10483) YARN hangs, jobs cannot be submitted, and only switching the active RM or restarting recovers it

2020-11-05 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226840#comment-17226840
 ] 

Ahmed Hussein commented on YARN-10483:
--

[~Jufeng] Can you please change the title and description of this Jira to 
English?

 I do not think it is a good idea to have Jiras in multiple languages because it 
complicates searching for everyone.
Thank you.

> YARN hangs, jobs cannot be submitted, and only switching the active RM or restarting recovers it
> --
>
> Key: YARN-10483
> URL: https://issues.apache.org/jira/browse/YARN-10483
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler, resourcemanager, 
> RM
>Affects Versions: 3.1.1
>Reporter: jufeng li
>Priority: Blocker
> Attachments: RM_normal_state.stack, RM_unnormal_state.stack
>
>



[jira] [Created] (YARN-10468) TestNodeStatusUpdater does not handle early failure in threads

2020-10-20 Thread Ahmed Hussein (Jira)
Ahmed Hussein created YARN-10468:


 Summary: TestNodeStatusUpdater does not handle early failure in 
threads
 Key: YARN-10468
 URL: https://issues.apache.org/jira/browse/YARN-10468
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Ahmed Hussein


While investigating HADOOP-17314, I found the following:

* TestNodeStatusUpdater#testNMRegistration() will continue running {{while 
(heartBeatID <= 3 && waitCount++ != 200) {}} even though the NM thread could 
already be dead. The unit test should detect that the NM has died and terminate 
sooner to release resources for other tests.
* TestNodeStatusUpdater#testNMRMConnectionConf() has the same problem as 
described above. 
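
The early exit described in the list above could be sketched roughly as follows. {{LivenessAwareWait}}, {{waitFor}}, and the parameter names are hypothetical and are not taken from {{TestNodeStatusUpdater}}; the sketch also assumes the worker thread has already been started, since {{Thread#isAlive()}} returns false for a thread that has not been started yet.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical test helper; names are illustrative only. */
final class LivenessAwareWait {

  private LivenessAwareWait() {
  }

  /**
   * Polls condition until it holds or maxPolls is reached, but gives up as
   * soon as the worker thread is found dead with the condition still unmet.
   *
   * @return true if the condition was met, false if the polls ran out
   * @throws IllegalStateException if the worker died first or the wait was
   *         interrupted
   */
  static boolean waitFor(BooleanSupplier condition, Thread worker,
      int maxPolls, long pollMillis) {
    for (int i = 0; i < maxPolls; i++) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (!worker.isAlive()) {
        // Re-check once: the worker may have met the condition just before exiting.
        if (condition.getAsBoolean()) {
          return true;
        }
        throw new IllegalStateException("Worker thread '" + worker.getName()
            + "' died before the condition was met");
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting", e);
      }
    }
    return condition.getAsBoolean();
  }
}
```

A loop of this kind fails on the first poll after the worker dies instead of sleeping through all 200 iterations, which is the resource saving the ticket asks for.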






[jira] [Commented] (YARN-10455) TestNMProxy.testNMProxyRPCRetry is not consistent

2020-10-09 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211436#comment-17211436
 ] 

Ahmed Hussein commented on YARN-10455:
--

Thanks [~ebadger].

> TestNMProxy.testNMProxyRPCRetry is not consistent
> -
>
> Key: YARN-10455
> URL: https://issues.apache.org/jira/browse/YARN-10455
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Fix For: 3.1.2, 3.2.2, 3.4.0, 3.3.1, 2.10.2
>
> Attachments: YARN-10455-branch-2.10.001.patch, YARN-10455.001.patch
>
>
> The fix in YARN-8844 may fail depending on the configuration of the machine 
> running the test.
>  In some cases the address gets resolved and the unit test throws a connection 
> timeout exception instead. In that scenario JUnit times out, and the main 
> reason behind the failure is swallowed by the shutdown of the clients.
>  To make the test behavior consistent, a suggested fix is to set the host 
> address to {{127.0.0.1:1}}. This eliminates the possibility of collisions on 
> non-privileged ports.
>  Also, it is more correct to catch {{SocketException}} directly rather than 
> catching {{IOException}} and then checking that it is not a {{SocketException}}.
>  
> The stack trace with such failures:
> {code:bash}
> [INFO] Running 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy
> [ERROR] Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 24.293 s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy
> [ERROR] 
> testNMProxyRPCRetry(org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy)
>   Time elapsed: 20.18 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 2 
> milliseconds
>   at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
>   at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
>   at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
>   at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>   at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:336)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:203)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:586)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:700)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:821)
>   at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1645)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1461)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1414)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:234)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:119)
>   at com.sun.proxy.$Proxy24.startContainers(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:133)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>   at com.sun.proxy.$Proxy25.startContainers(Unknown Source)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy.testNMProxyRPCRetry(TestNMProxy.java:167)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   

[jira] [Commented] (YARN-10455) TestNMProxy.testNMProxyRPCRetry is not consistent

2020-10-09 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211063#comment-17211063
 ] 

Ahmed Hussein commented on YARN-10455:
--

Thank you [~Jim_Brennan]! I uploaded a patch for branch-2.10.

> TestNMProxy.testNMProxyRPCRetry is not consistent
> -
>
> Key: YARN-10455
> URL: https://issues.apache.org/jira/browse/YARN-10455
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Fix For: 3.1.2, 3.2.2, 3.4.0, 3.3.1
>
> Attachments: YARN-10455-branch-2.10.001.patch, YARN-10455.001.patch
>
>

[jira] [Updated] (YARN-10455) TestNMProxy.testNMProxyRPCRetry is not consistent

2020-10-08 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-10455:
-
Attachment: YARN-10455-branch-2.10.001.patch

> TestNMProxy.testNMProxyRPCRetry is not consistent
> -
>
> Key: YARN-10455
> URL: https://issues.apache.org/jira/browse/YARN-10455
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Fix For: 3.1.2, 3.2.2, 3.4.0, 3.3.1
>
> Attachments: YARN-10455-branch-2.10.001.patch, YARN-10455.001.patch
>
>

[jira] [Commented] (YARN-10455) TestNMProxy.testNMProxyRPCRetry is not consistent

2020-10-08 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17210272#comment-17210272
 ] 

Ahmed Hussein commented on YARN-10455:
--

[~leftnoteasy], [~eyang], [~Jim_Brennan]
Can you please take a look at the patch?

> TestNMProxy.testNMProxyRPCRetry is not consistent
> -
>
> Key: YARN-10455
> URL: https://issues.apache.org/jira/browse/YARN-10455
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: YARN-10455.001.patch
>
>

[jira] [Updated] (YARN-10455) TestNMProxy.testNMProxyRPCRetry is not consistent

2020-10-07 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-10455:
-
Description: 
The fix in YARN-8844 may fail depending on the configuration of the machine 
running the test.
 In some cases the address gets resolved and the unit test throws a connection 
timeout exception instead. In such a scenario the JUnit test times out, and the 
main reason behind the failure is swallowed by the shutdown of the clients.
 To make sure that the JUnit behavior is consistent, a suggested fix is to set 
the host address to {{127.0.0.1:1}}. The latter eliminates the possibility of 
collisions on non-privileged ports.
 Also, it is more correct to catch {{SocketException}} directly rather than 
catching {{IOException}} and then checking for anything that is not a 
{{SocketException}}.
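
To illustrate the last point, here is a minimal sketch of catching {{SocketException}} directly. The class {{SocketFailureCheck}} and its {{Call}} interface are hypothetical names made up for this example and are not taken from {{TestNMProxy}}; the only facts relied on are that {{ConnectException}} is a subclass of {{SocketException}} and that {{SocketTimeoutException}} is not.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketException;

// Hypothetical helper for illustration; not part of the Hadoop code base.
class SocketFailureCheck {

  /** Stand-in for an RPC call that may fail with an IOException. */
  interface Call {
    void run() throws IOException;
  }

  /**
   * Returns true only when the call fails with a SocketException
   * (for example ConnectException, "Connection refused").
   * Any other IOException propagates to the caller unchanged.
   */
  static boolean failsWithSocketException(Call call) throws IOException {
    try {
      call.run();
      return false;
    } catch (SocketException e) {
      // Caught directly, so no instanceof check inside a broad
      // IOException handler is needed.
      return true;
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(failsWithSocketException(() -> {
      throw new ConnectException("Connection refused");
    }));
  }
}
```

With this shape, a connection timeout ({{SocketTimeoutException}}) is not silently treated as the expected failure; it surfaces as its own exception.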

 

The stack trace with such failures:


{code:bash}
[INFO] Running 
org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy
[ERROR] Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 24.293 
s <<< FAILURE! - in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy
[ERROR] 
testNMProxyRPCRetry(org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy)
  Time elapsed: 20.18 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 2 
milliseconds
at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at 
org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:336)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:203)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:586)
at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:700)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:821)
at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1645)
at org.apache.hadoop.ipc.Client.call(Client.java:1461)
at org.apache.hadoop.ipc.Client.call(Client.java:1414)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:234)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:119)
at com.sun.proxy.$Proxy24.startContainers(Unknown Source)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:133)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
at com.sun.proxy.$Proxy25.startContainers(Unknown Source)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy.testNMProxyRPCRetry(TestNMProxy.java:167)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)

[INFO]
[INFO] 

[jira] [Updated] (YARN-10455) TestNMProxy.testNMProxyRPCRetry is not consistent

2020-10-07 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-10455:
-
Description: 
The fix in YARN-8844 may fail depending on the configuration of the machine 
running the test.
In some cases the address gets resolved and the unit test throws a connection 
timeout exception instead. In such a scenario the JUnit test times out, and the 
main reason behind the failure is swallowed by the shutdown of the clients.
To make sure that the JUnit behavior is consistent, a suggested fix is to set 
the host address to {{127.0.0.1:1}}. The latter eliminates the possibility of 
collisions on non-privileged ports.
Also, it is more correct to catch {{SocketException}} directly rather than 
catching {{IOException}} and then checking for anything that is not a 
{{SocketException}}.

  was:
The fix in YARN-8844 may fail depending on the configuration of the machine 
running the test.
In some cases the address gets resolved and the unit test throws a connection 
timeout exception instead. In such a scenario the JUnit test times out, and the 
main reason behind the failure is swallowed by the shutdown of the clients.
To make sure that the JUnit behavior is consistent, a suggested fix is to set 
the host address to {{127.0.0.1:1}}. The latter eliminates the possibility of 
collisions on non-privileged ports.


> TestNMProxy.testNMProxyRPCRetry is not consistent
> -
>
> Key: YARN-10455
> URL: https://issues.apache.org/jira/browse/YARN-10455
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>
> The fix in YARN-8844 may fail depending on the configuration of the machine 
> running the test.
> In some cases the address gets resolved and the unit test throws a connection 
> timeout exception instead. In such a scenario the JUnit test times out, and 
> the main reason behind the failure is swallowed by the shutdown of the clients.
> To make sure that the JUnit behavior is consistent, a suggested fix is to set 
> the host address to {{127.0.0.1:1}}. The latter eliminates the possibility of 
> collisions on non-privileged ports.
> Also, it is more correct to catch {{SocketException}} directly rather than 
> catching {{IOException}} and then checking for anything that is not a 
> {{SocketException}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10455) TestNMProxy.testNMProxyRPCRetry is not consistent

2020-10-07 Thread Ahmed Hussein (Jira)
Ahmed Hussein created YARN-10455:


 Summary: TestNMProxy.testNMProxyRPCRetry is not consistent
 Key: YARN-10455
 URL: https://issues.apache.org/jira/browse/YARN-10455
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ahmed Hussein
Assignee: Ahmed Hussein


The fix in YARN-8844 may fail depending on the configuration of the machine 
running the test.
In some cases the address gets resolved and the unit test throws a connection 
timeout exception instead. In such a scenario the JUnit test times out, and the 
main reason behind the failure is swallowed by the shutdown of the clients.
To make sure that the JUnit behavior is consistent, a suggested fix is to set 
the host address to {{127.0.0.1:1}}. The latter eliminates the possibility of 
collisions on non-privileged ports.
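
The idea behind {{127.0.0.1:1}} can be tried in isolation with a small probe. This is only a sketch: {{LoopbackPortProbe}} is a hypothetical class, the 2000 ms timeout is an arbitrary choice, and the expectation that nothing listens on port 1 is an assumption about a typical machine rather than a guarantee.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical probe for illustration; not part of the Hadoop test code.
class LoopbackPortProbe {

  /** Tries to connect and reports the outcome as a short string. */
  static String probe(String host, int port, int timeoutMillis) {
    try (Socket socket = new Socket()) {
      socket.connect(new InetSocketAddress(host, port), timeoutMillis);
      return "connected";
    } catch (IOException e) {
      // Assumption: on a typical machine nothing listens on port 1,
      // so this is expected to be a ConnectException.
      return e.getClass().getSimpleName();
    }
  }

  public static void main(String[] args) {
    // A literal loopback address needs no name resolution, and port 1
    // is in the privileged range (below 1024).
    System.out.println(probe("127.0.0.1", 1, 2000));
  }
}
```

Because the address is a literal and the port is privileged, the outcome should not depend on DNS configuration or on which ephemeral ports other tests happen to hold.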






[jira] [Created] (YARN-10337) TestRMHATimelineCollectors fails on hadoop trunk

2020-07-02 Thread Ahmed Hussein (Jira)
Ahmed Hussein created YARN-10337:


 Summary: TestRMHATimelineCollectors fails on hadoop trunk
 Key: YARN-10337
 URL: https://issues.apache.org/jira/browse/YARN-10337
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test, yarn
Reporter: Ahmed Hussein


{{TestRMHATimelineCollectors}} has been failing on trunk. I see it frequently 
in the qbt reports and the yetus reports.


{code:bash}
[INFO] Running 
org.apache.hadoop.yarn.server.resourcemanager.TestRMHATimelineCollectors
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 5.95 s 
<<< FAILURE! - in 
org.apache.hadoop.yarn.server.resourcemanager.TestRMHATimelineCollectors
[ERROR] 
testRebuildCollectorDataOnFailover(org.apache.hadoop.yarn.server.resourcemanager.TestRMHATimelineCollectors)
  Time elapsed: 5.615 s  <<< ERROR!
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.resourcemanager.TestRMHATimelineCollectors.testRebuildCollectorDataOnFailover(TestRMHATimelineCollectors.java:105)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:80)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)

[INFO]
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR]   TestRMHATimelineCollectors.testRebuildCollectorDataOnFailover:105 
NullPointer
[INFO]
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0
[INFO]
[ERROR] There are test failures.

{code}







[jira] [Created] (YARN-10334) TestDistributedShell leaks resources on timeout/failure

2020-06-30 Thread Ahmed Hussein (Jira)
Ahmed Hussein created YARN-10334:


 Summary: TestDistributedShell leaks resources on timeout/failure
 Key: YARN-10334
 URL: https://issues.apache.org/jira/browse/YARN-10334
 Project: Hadoop YARN
  Issue Type: Bug
  Components: distributed-shell, test, yarn
Reporter: Ahmed Hussein


{{TestDistributedShell}} times out on trunk. I found that the application and 
its containers stay running in the background long after the unit test has 
failed.
This causes other test cases to fail and produces several false-positive 
failures because:
* Ports stay busy, so other test cases fail to launch.
* Unit tests fail because of memory restrictions.

Although the unit test is already broken on trunk, we do not want its failures 
to spread to other unit tests.
{{TestDistributedShell}} needs to be revisited to make sure that all 
{{YarnClients}} and {{YarnApplications}} are closed properly at the end of 
each unit test (including on exceptions and timeouts).
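
One possible shape for such cleanup is sketched below. {{TestResourceTracker}} is a hypothetical helper, not existing Hadoop code, and the sketch assumes the resources can be wrapped as {{AutoCloseable}}. In the real test, {{closeAll()}} would presumably be called from the teardown method; whether that teardown still runs after a timeout depends on how the timeout is configured.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper; class and method names are illustrative only.
class TestResourceTracker {
  private final Deque<AutoCloseable> resources = new ArrayDeque<>();

  /** Registers a resource so that closeAll() releases it later. */
  void register(AutoCloseable resource) {
    resources.push(resource);
  }

  /**
   * Closes everything in reverse registration order. A failure to close
   * one resource does not prevent the remaining ones from being closed.
   * Returns the number of resources whose close() threw.
   */
  int closeAll() {
    int failures = 0;
    while (!resources.isEmpty()) {
      try {
        resources.pop().close();
      } catch (Exception e) {
        failures++;
      }
    }
    return failures;
  }

  public static void main(String[] args) {
    TestResourceTracker tracker = new TestResourceTracker();
    try {
      tracker.register(() -> System.out.println("client stopped"));
      throw new IllegalStateException("simulated test failure");
    } catch (IllegalStateException e) {
      System.out.println("test failed: " + e.getMessage());
    } finally {
      // Runs whether the test body succeeded or threw.
      tracker.closeAll();
    }
  }
}
```

Closing in reverse order mirrors the usual dependency direction: clients registered last are stopped before the cluster they talk to.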

Steps to reproduce:



{code:bash}
mvn test -Dtest=TestDistributedShell#testDSShellWithOpportunisticContainers

## this will time out with:
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 90.234 
s <<< FAILURE! - in 
org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
[ERROR] 
testDSShellWithOpportunisticContainers(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell)
  Time elapsed: 90.018 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 9 
milliseconds
at java.lang.Thread.sleep(Native Method)
at 
org.apache.hadoop.yarn.applications.distributedshell.Client.monitorApplication(Client.java:1117)
at 
org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:1089)
at 
org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithOpportunisticContainers(TestDistributedShell.java:1438)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)

[INFO] 
[INFO] Results:
[INFO] 
[ERROR] Errors: 
[ERROR]   TestDistributedShell.testDSShellWithOpportunisticContainers:1438 » 
TestTimedOut
[INFO] 
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0
{code}


Using the {{ps}} command, you can see that the YARN processes are still running 
in the background:

{code:bash}
/bin/bash -c $JRE_HOME/bin/java -Xmx512m 
org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
--container_type OPPORTUNISTIC --container_memory 128 --container_vcores 1 
--num_containers 2 --priority 0 --appname DistributedShell --homedir 
file:/Users/ahussein 
1>$WORK_DIR8/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/target/TestDistributedShell/TestDistributedShell-logDir-nm-0_0/application_1593554710896_0001/container_1593554710896_0001_01_01/AppMaster.stdout
 
2>$WORK_DIR8/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/target/TestDistributedShell/TestDistributedShell-logDir-nm-0_0/application_1593554710896_0001/container_1593554710896_0001_01_01/AppMaster.stderr


$JRE_HOME/bin/java -Xmx512m 
org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
--container_type OPPORTUNISTIC --container_memory 128 --container_vcores 1 
--num_containers 2 --priority 0 --appname DistributedShell --homedir 
file:/Users/ahussein
{code}








[jira] [Commented] (YARN-10176) TestTimelineAuthFilterForV2 fails intermittently

2020-05-12 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105725#comment-17105725
 ] 

Ahmed Hussein commented on YARN-10176:
--


{code:bash}
lineservice.security.TestTimelineAuthFilterForV2
[ERROR] 
testPutTimelineEntities[1](org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2)
  Time elapsed: 6.611 s  <<< FAILURE!
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.verifyEntity(TestTimelineAuthFilterForV2.java:293)
at 
org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.testPutTimelineEntities(TestTimelineAuthFilterForV2.java:437)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)

[INFO]
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   
TestTimelineAuthFilterForV2.testPutTimelineEntities:437->verifyEntity:293
[INFO]
[ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0
[INFO]
[ERROR] There are test failures.

{code}


> TestTimelineAuthFilterForV2 fails intermittently
> 
>
> Key: YARN-10176
> URL: https://issues.apache.org/jira/browse/YARN-10176
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineservice
>Reporter: Ahmed Hussein
>Assignee: Prabhu Joseph
>Priority: Major
>
> TestTimelineAuthFilterForV2 fails intermittently on trunk and branch-2.10.
> To reproduce the failure, execute TestTimelineAuthFilterForV2 inside a loop.
> {code:bash}
> [INFO] Running 
> org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2
> [ERROR] Tests 

[jira] [Resolved] (YARN-10220) RM HA times out intermittently

2020-05-12 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein resolved YARN-10220.
--
Resolution: Cannot Reproduce

I will close it for now since I cannot reproduce the failures as reported in 
YARN-2710

> RM HA times out intermittently
> --
>
> Key: YARN-10220
> URL: https://issues.apache.org/jira/browse/YARN-10220
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.10.0, 3.3.0, 3.2.1, 3.1.3
>Reporter: Ahmed Hussein
>Assignee: Bilwa S T
>Priority: Major
>
> TestResourceTrackerOnHA, among other tests, times out intermittently:
> * TestApplicationClientProtocolOnHA
> * TestApplicationMasterServiceProtocolForTimelineV2
> * TestApplicationMasterServiceProtocolOnHA
> {code:bash}
> [INFO] --- maven-surefire-plugin:3.0.0-M1:test (default-test) @ 
> hadoop-yarn-client ---
> [INFO]
> [INFO] ---
> [INFO]  T E S T S
> [INFO] ---
> [INFO] Running org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 19.612 s <<< FAILURE! - in 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
> [ERROR] 
> testResourceTrackerOnHA(org.apache.hadoop.yarn.client.TestResourceTrackerOnHA)
>   Time elapsed: 19.473 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 15000 
> milliseconds
>   at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
>   at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
>   at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
>   at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>   at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:336)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:203)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:699)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:812)
>   at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1452)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1405)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>   at com.sun.proxy.$Proxy93.registerNodeManager(Unknown Source)
>   at 
> org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:73)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy94.registerNodeManager(Unknown Source)
>   at 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA.testResourceTrackerOnHA(TestResourceTrackerOnHA.java:64)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> 

[jira] [Commented] (YARN-10220) RM HA times out intermittently

2020-05-12 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105499#comment-17105499
 ] 

Ahmed Hussein commented on YARN-10220:
--

[~BilwaST], I could not reproduce it again for 3.x or 2.10.

I think this is good news then!

I will close it.

> RM HA times out intermittently
> --
>
> Key: YARN-10220
> URL: https://issues.apache.org/jira/browse/YARN-10220
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.10.0, 3.3.0, 3.2.1, 3.1.3
>Reporter: Ahmed Hussein
>Assignee: Bilwa S T
>Priority: Major
>
> TestResourceTrackerOnHA, among other tests, times out intermittently:
> * TestApplicationClientProtocolOnHA
> * TestApplicationMasterServiceProtocolForTimelineV2
> * TestApplicationMasterServiceProtocolOnHA
> {code:bash}
> [INFO] --- maven-surefire-plugin:3.0.0-M1:test (default-test) @ 
> hadoop-yarn-client ---
> [INFO]
> [INFO] ---
> [INFO]  T E S T S
> [INFO] ---
> [INFO] Running org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 19.612 s <<< FAILURE! - in 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
> [ERROR] 
> testResourceTrackerOnHA(org.apache.hadoop.yarn.client.TestResourceTrackerOnHA)
>   Time elapsed: 19.473 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 15000 
> milliseconds
>   at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
>   at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
>   at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
>   at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>   at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:336)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:203)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:699)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:812)
>   at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1452)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1405)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>   at com.sun.proxy.$Proxy93.registerNodeManager(Unknown Source)
>   at 
> org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:73)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy94.registerNodeManager(Unknown Source)
>   at 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA.testResourceTrackerOnHA(TestResourceTrackerOnHA.java:64)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> 

[jira] [Commented] (YARN-8959) TestContainerResizing fails randomly

2020-05-06 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17100858#comment-17100858
 ] 

Ahmed Hussein commented on YARN-8959:
-

{quote}Could waitForThreadToWait() be switched to await(). I think it will be 
easiest to understand as there is already a large precedent and readers of the 
code will be familiar. It will also be less specialized code to maintain in the 
code base.{quote}

Thanks [~jeagles] for the feedback.
You are right, {{await()}} seems to be an easier and more stable alternative to 
{{waitForThreadToWait}}.
I have uploaded new patches with the changes.

For 3.x, the UT now seems to be stable, with no intermittent failures.

For 2.10, the UT became more stable, but it still eventually fails with a 
new error. I will create a new Jira to address that failure, since it is 
different from the original failures that triggered this Jira.


{code:bash}
[INFO] Running org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing
[ERROR] Tests run: 10, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.597 s <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing
[ERROR] testIncreaseContainerUnreservedWhenApplicationCompleted(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)  Time elapsed: 0.265 s  <<< FAILURE!
java.lang.AssertionError: expected null, but was:
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotNull(Assert.java:664)
    at org.junit.Assert.assertNull(Assert.java:646)
    at org.junit.Assert.assertNull(Assert.java:656)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted(TestContainerResizing.java:826)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
    at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)

[INFO]
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted:826 expected null, but was:
[INFO]
[ERROR] Tests run: 10, Failures: 1, Errors: 0, Skipped: 0
{code}


> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: YARN-8959-branch-2.10.002.patch, 
> YARN-8959-branch-2.10.003.patch, YARN-8959-branch-2.10.004.patch, 
> YARN-8959.001.patch, YARN-8959.002.patch, YARN-8959.003.patch
>
>
> 

[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-05-06 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: YARN-8959.003.patch

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: YARN-8959-branch-2.10.002.patch, 
> YARN-8959-branch-2.10.003.patch, YARN-8959-branch-2.10.004.patch, 
> YARN-8959.001.patch, YARN-8959.002.patch, YARN-8959.003.patch
>
>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer
> {code}
> testSimpleDecreaseContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)
>   Time elapsed: 0.348 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<3072>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer(TestContainerResizing.java:210)
> {code}
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted
> {code}
> testIncreaseContainerUnreservedWhenContainerCompleted(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)
>   Time elapsed: 0.445 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<7168>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted(TestContainerResizing.java:729)
> {code}
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer
> {code}
> testExcessiveReservationWhenDecreaseSameContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)
>   Time elapsed: 0.321 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<2048>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1015)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer(TestContainerResizing.java:623)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-05-06 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: YARN-8959-branch-2.10.004.patch

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: YARN-8959-branch-2.10.002.patch, 
> YARN-8959-branch-2.10.003.patch, YARN-8959-branch-2.10.004.patch, 
> YARN-8959.001.patch, YARN-8959.002.patch
>
>






[jira] [Commented] (YARN-8959) TestContainerResizing fails randomly

2020-05-04 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17099351#comment-17099351
 ] 

Ahmed Hussein commented on YARN-8959:
-

The unit test had a race condition in testSimpleDecreaseContainer. I could not 
reproduce failures in the other test cases.
 * replace "assert" with GenericTestUtils.waitFor()
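
The idea behind that change can be illustrated with a minimal, self-contained sketch. It does not use Hadoop's test utilities: the {{waitFor}} helper, the class name, and the simulated scheduler thread below are all hypothetical stand-ins, and the 3072/1024 values are borrowed from the reported failure only for illustration. The sketch shows why an immediate assertion can race with an asynchronous update, while polling with a timeout does not.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class WaitForSketch {

  // Hypothetical helper (NOT Hadoop's GenericTestUtils): polls `check` every
  // `intervalMs` milliseconds until it returns true, and throws a
  // TimeoutException if it is still false after `timeoutMs` milliseconds.
  public static void waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
      throws TimeoutException, InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!check.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException("condition not met within " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs);
    }
  }

  // Simulates a component that updates the used memory asynchronously,
  // then waits for the update instead of asserting right away.
  public static int decreaseThenWait(int startMb, int targetMb) throws Exception {
    AtomicInteger usedMb = new AtomicInteger(startMb);
    Thread scheduler = new Thread(() -> {
      try {
        Thread.sleep(50); // the update lands some time after the request
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      usedMb.set(targetMb);
    });
    scheduler.start();

    // Checking usedMb.get() == targetMb at this point would race with the
    // scheduler thread and could still observe startMb.
    waitFor(() -> usedMb.get() == targetMb, 10, 5000);

    scheduler.join();
    return usedMb.get();
  }

  public static void main(String[] args) throws Exception {
    System.out.println(decreaseThenWait(3072, 1024));
  }
}
```

Once the polling call returns normally, the condition is known to hold, so a follow-up assertion on the same value adds nothing; if the condition never holds, the timeout exception fails the test.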

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: YARN-8959-branch-2.10.002.patch, 
> YARN-8959-branch-2.10.003.patch, YARN-8959.001.patch, YARN-8959.002.patch
>
>






[jira] [Commented] (YARN-10256) Refactor TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic

2020-05-04 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17099227#comment-17099227
 ] 

Ahmed Hussein commented on YARN-10256:
--

Thanks [~jeagles]!

> Refactor 
> TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic
> ---
>
> Key: YARN-10256
> URL: https://issues.apache.org/jira/browse/YARN-10256
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: refactoring, unit-test
> Fix For: 3.2.2, 3.4.0, 3.3.1, 3.1.5
>
> Attachments: YARN-10256.001.patch
>
>
> In 3.x, 
> {{TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic}}
> has redundant assertions. When {{GenericTestUtils.waitFor()}} returns 
> normally, the predicate is guaranteed to have been met; otherwise, the UT 
> would have thrown a timeout exception.
> The redundant loop makes the unit test harder to understand and may 
> increase the possibility of failure in case the container terminates






[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-05-04 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: YARN-8959-branch-2.10.003.patch

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: YARN-8959-branch-2.10.002.patch, 
> YARN-8959-branch-2.10.003.patch, YARN-8959.001.patch, YARN-8959.002.patch
>
>






[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-05-04 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: YARN-8959-branch-2.10.002.patch

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: YARN-8959-branch-2.10.002.patch, YARN-8959.001.patch, 
> YARN-8959.002.patch
>
>






[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-05-04 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: YARN-8959.002.patch

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: YARN-8959.001.patch, YARN-8959.002.patch
>
>






[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-05-04 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: (was: YARN-8959-branch-2.10.001.patch)

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: YARN-8959.001.patch, YARN-8959.002.patch
>
>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer
> {code}
> testSimpleDecreaseContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)  Time elapsed: 0.348 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<3072>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer(TestContainerResizing.java:210)
> {code}
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted
> {code}
> testIncreaseContainerUnreservedWhenContainerCompleted(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)  Time elapsed: 0.445 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<7168>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted(TestContainerResizing.java:729)
> {code}
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer
> {code}
> testExcessiveReservationWhenDecreaseSameContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)  Time elapsed: 0.321 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<2048>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1015)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer(TestContainerResizing.java:623)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> {code}






[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-05-04 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: (was: YARN-8959-branch-2.10.002.patch)

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin Chundatt
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: YARN-8959.001.patch, YARN-8959.002.patch
>
>






[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-05-04 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: (was: YARN-8959.002.patch)

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin Chundatt
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: YARN-8959.001.patch, YARN-8959.002.patch
>
>






[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-05-04 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: YARN-8959.002.patch

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin Chundatt
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: YARN-8959-branch-2.10.001.patch, 
> YARN-8959-branch-2.10.002.patch, YARN-8959.001.patch, YARN-8959.002.patch
>
>






[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-05-04 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: YARN-8959-branch-2.10.002.patch

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin Chundatt
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: YARN-8959-branch-2.10.001.patch, 
> YARN-8959-branch-2.10.002.patch, YARN-8959.001.patch
>
>






[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-05-04 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: (was: YARN-8959-branch-2.10.005.patch)

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin Chundatt
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: YARN-8959-branch-2.10.001.patch, YARN-8959.001.patch
>
>






[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-05-04 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: (was: YARN-8959-branch-2.10.006.patch)

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin Chundatt
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: YARN-8959-branch-2.10.001.patch, YARN-8959.001.patch
>
>






[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-05-01 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: YARN-8959-branch-2.10.006.patch

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin Chundatt
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: YARN-8959-branch-2.10.001.patch, 
> YARN-8959-branch-2.10.005.patch, YARN-8959-branch-2.10.006.patch, 
> YARN-8959.001.patch
>
>






[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-05-01 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: YARN-8959.001.patch

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin Chundatt
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: YARN-8959-branch-2.10.001.patch, 
> YARN-8959-branch-2.10.005.patch, YARN-8959.001.patch
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-05-01 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: YARN-8959-branch-2.10.005.patch

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: YARN-8959-branch-2.10.001.patch, 
> YARN-8959-branch-2.10.005.patch
>
>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer
> {code}
> testSimpleDecreaseContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)
>   Time elapsed: 0.348 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<3072>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer(TestContainerResizing.java:210)
> {code}
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted
> {code}
> testIncreaseContainerUnreservedWhenContainerCompleted(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)
>   Time elapsed: 0.445 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<7168>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted(TestContainerResizing.java:729)
> {code}
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer
> {code}
> testExcessiveReservationWhenDecreaseSameContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)
>   Time elapsed: 0.321 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<2048>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1015)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer(TestContainerResizing.java:623)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> {code}






[jira] [Updated] (YARN-8959) TestContainerResizing fails randomly

2020-04-30 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-8959:

Attachment: YARN-8959-branch-2.10.001.patch

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: YARN-8959-branch-2.10.001.patch
>
>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer
> {code}
> testSimpleDecreaseContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)
>   Time elapsed: 0.348 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<3072>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer(TestContainerResizing.java:210)
> {code}
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted
> {code}
> testIncreaseContainerUnreservedWhenContainerCompleted(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)
>   Time elapsed: 0.445 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<7168>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted(TestContainerResizing.java:729)
> {code}
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer
> {code}
> testExcessiveReservationWhenDecreaseSameContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)
>   Time elapsed: 0.321 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<2048>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1015)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer(TestContainerResizing.java:623)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> {code}






[jira] [Commented] (YARN-10255) fix intermittent failure TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic in branch-2.10

2020-04-30 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17097000#comment-17097000
 ] 

Ahmed Hussein commented on YARN-10255:
--

Thanks [~jeagles] for the heads-up. I created a new Jira, YARN-10256, to 
refactor the unit test in 3.x.

> fix intermittent failure 
> TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic
>  in branch-2.10
> --
>
> Key: YARN-10255
> URL: https://issues.apache.org/jira/browse/YARN-10255
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: unit-test
> Attachments: YARN-10255-branch-2.10.001.patch
>
>
> * Backport YARN-7372 to branch-2.10
> UT failure in branch-2.10:
>  
> {noformat}
> testContainerUpdateExecTypeGuaranteedToOpportunistic:
>   message='expected:OPPORTUNISTIC but 
> 

[jira] [Updated] (YARN-10256) Refactor TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic

2020-04-30 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-10256:
-
Attachment: YARN-10256.001.patch

> Refactor 
> TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic
> ---
>
> Key: YARN-10256
> URL: https://issues.apache.org/jira/browse/YARN-10256
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: refactoring, unit-test
> Attachments: YARN-10256.001.patch
>
>
> In 3.x, 
> {{TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic}}
>  has redundant assertions. {{GenericTestUtils.waitFor()}} returns only once 
> the predicate is met; otherwise it throws a timeout exception and the UT 
> fails, so re-checking the same condition afterwards adds nothing.
> The redundant loop makes the unit test harder to understand and may 
> increase the possibility of failure in case the container terminates






[jira] [Created] (YARN-10256) Refactor TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic

2020-04-30 Thread Ahmed Hussein (Jira)
Ahmed Hussein created YARN-10256:


 Summary: Refactor 
TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic
 Key: YARN-10256
 URL: https://issues.apache.org/jira/browse/YARN-10256
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ahmed Hussein
Assignee: Ahmed Hussein


In 3.x, 
{{TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic}}
 has redundant assertions. {{GenericTestUtils.waitFor()}} returns only once 
the predicate is met; otherwise it throws a timeout exception and the UT 
fails, so re-checking the same condition afterwards adds nothing.
The redundant loop makes the unit test harder to understand and may 
increase the possibility of failure in case the container terminates






[jira] [Updated] (YARN-10255) fix intermittent failure TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic in branch-2.10

2020-04-30 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-10255:
-
Attachment: (was: YARN-10255.001.patch)

> fix intermittent failure 
> TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic
>  in branch-2.10
> --
>
> Key: YARN-10255
> URL: https://issues.apache.org/jira/browse/YARN-10255
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: unit-test
> Attachments: YARN-10255-branch-2.10.001.patch
>
>
> * Backport YARN-7372 to branch-2.10
> UT failure in branch-2.10:
>  
> {noformat}
> testContainerUpdateExecTypeGuaranteedToOpportunistic:
>   message='expected:OPPORTUNISTIC but 
> 

[jira] [Updated] (YARN-10255) fix intermittent failure TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic in branch-2.10

2020-04-30 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-10255:
-
Description: 
* Backport YARN-7372 to branch-2.10

UT failure in branch-2.10:

 
{noformat}
testContainerUpdateExecTypeGuaranteedToOpportunistic:
 

[jira] [Updated] (YARN-10255) fix intermittent failure TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic in branch-2.10

2020-04-30 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-10255:
-
Summary: fix intermittent failure 
TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic
 in branch-2.10  (was: revisit fix to intermittent 
TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic)

> fix intermittent failure 
> TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic
>  in branch-2.10
> --
>
> Key: YARN-10255
> URL: https://issues.apache.org/jira/browse/YARN-10255
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: unit-test
> Attachments: YARN-10255-branch-2.10.001.patch, YARN-10255.001.patch
>
>
> Creating this Jira to fix intermittent failure in branch-2.10. Also, the fix 
> in YARN-7372 has some redundancy in assertion that could be removed.
> UT failure in branch-2.10:
>  {noformat}
> testContainerUpdateExecTypeGuaranteedToOpportunistic:
>   message='expected:OPPORTUNISTIC but 
> 

[jira] [Commented] (YARN-10255) revisit fix to intermittent TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic

2020-04-30 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17096945#comment-17096945
 ] 

Ahmed Hussein commented on YARN-10255:
--

Oh I see. Thanks [~jeagles] for taking the time to give me those tips. I will 
make sure to follow that standard.
Thank you for bearing with me :)

> revisit fix to intermittent 
> TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic
> --
>
> Key: YARN-10255
> URL: https://issues.apache.org/jira/browse/YARN-10255
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: unit-test
> Attachments: YARN-10255-branch-2.10.001.patch, YARN-10255.001.patch
>
>
> Creating this Jira to fix intermittent failure in branch-2.10. Also, the fix 
> in YARN-7372 has some redundancy in assertion that could be removed.
> UT failure in branch-2.10:
>  {noformat}
> testContainerUpdateExecTypeGuaranteedToOpportunistic:
>   message='expected:OPPORTUNISTIC but 
> 

[jira] [Commented] (YARN-10255) revisit fix to intermittent TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic

2020-04-30 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17096886#comment-17096886
 ] 

Ahmed Hussein commented on YARN-10255:
--

Thanks [~jeagles]

For 3.x, the assertions are redundant. {{GenericTestUtils.waitFor()}} returns 
only once the predicate is met; otherwise, the UT throws a timeout exception.
Since I needed to fix the same unit test for 2.10, I took the opportunity to 
reduce the poll interval and remove the redundant loop, because its purpose 
was confusing.
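
To illustrate the point about redundancy, here is a minimal sketch of a polling wait helper. It is not the Hadoop implementation: {{WaitForSketch}} and its {{waitFor}} method are hypothetical stand-ins written for this example, and the poll interval and timeout values are arbitrary. The sketch only shows that control reaches the line after the wait when the predicate has already been observed to hold, so asserting the same condition again adds nothing.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

class WaitForSketch {
  // Hypothetical stand-in for a waitFor-style test helper (an assumption,
  // not the Hadoop code): polls `check` every `intervalMs` milliseconds and
  // throws TimeoutException if it is still false after `timeoutMs`.
  static void waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
      throws TimeoutException, InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!check.getAsBoolean()) {
      if (System.nanoTime() >= deadline) {
        throw new TimeoutException(
            "condition not met within " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs);
    }
  }

  public static void main(String[] args) throws Exception {
    AtomicInteger polls = new AtomicInteger();
    // The predicate becomes true on the third poll, standing in for a
    // container that reaches the expected state after a short delay.
    waitFor(() -> polls.incrementAndGet() >= 3, 10, 1000);
    // Reaching this line means the predicate was observed to be true.
    // Asserting it again is redundant, and it can fail spuriously if the
    // state changes after the wait returns.
    System.out.println("predicate met after " + polls.get() + " polls");
  }
}
```

A shorter poll interval makes the helper notice the state change sooner, which is presumably why reducing it helps when the observed state is short-lived.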

> revisit fix to intermittent 
> TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic
> --
>
> Key: YARN-10255
> URL: https://issues.apache.org/jira/browse/YARN-10255
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: unit-test
> Attachments: YARN-10255-branch-2.10.001.patch, YARN-10255.001.patch
>
>
> Creating this Jira to fix intermittent failure in branch-2.10. Also, the fix 
> in YARN-7372 has some redundancy in assertion that could be removed.
> UT failure in branch-2.10:
>  {noformat}
> testContainerUpdateExecTypeGuaranteedToOpportunistic:
>   message='expected:OPPORTUNISTIC but 
> 

[jira] [Commented] (YARN-10255) revisit fix to intermittent TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic

2020-04-30 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17096861#comment-17096861
 ] 

Ahmed Hussein commented on YARN-10255:
--

I tested the UT in a loop and it did not fail for 100 minutes.

> revisit fix to intermittent 
> TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic
> --
>
> Key: YARN-10255
> URL: https://issues.apache.org/jira/browse/YARN-10255
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: unit-test
> Attachments: YARN-10255-branch-2.10.001.patch, YARN-10255.001.patch
>
>
> Creating this Jira to fix intermittent failure in branch-2.10. Also, the fix 
> in YARN-7372 has some redundancy in assertion that could be removed.
> UT failure in branch-2.10:
>  {noformat}
> testContainerUpdateExecTypeGuaranteedToOpportunistic:
>   message='expected:OPPORTUNISTIC but 
> 

[jira] [Commented] (YARN-10255) revisit fix to intermittent TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic

2020-04-30 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17096855#comment-17096855
 ] 

Ahmed Hussein commented on YARN-10255:
--

For branch-2.10, the findbugs and javadoc reports flag classes that the patch 
does not modify.

> revisit fix to intermittent 
> TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic
> --
>
> Key: YARN-10255
> URL: https://issues.apache.org/jira/browse/YARN-10255
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: unit-test
> Attachments: YARN-10255-branch-2.10.001.patch, YARN-10255.001.patch
>
>
> Creating this Jira to fix intermittent failure in branch-2.10. Also, the fix 
> in YARN-7372 has some redundancy in assertion that could be removed.
> UT failure in branch-2.10:
>  {noformat}
> testContainerUpdateExecTypeGuaranteedToOpportunistic:
>   message='expected:OPPORTUNISTIC but 
> 

[jira] [Updated] (YARN-10255) revisit fix to intermittent TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic

2020-04-30 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-10255:
-
Labels: unit-test  (was: )

> revisit fix to intermittent 
> TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic
> --
>
> Key: YARN-10255
> URL: https://issues.apache.org/jira/browse/YARN-10255
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: unit-test
> Attachments: YARN-10255-branch-2.10.001.patch, YARN-10255.001.patch
>
>
> Creating this Jira to fix intermittent failure in branch-2.10. Also, the fix 
> in YARN-7372 has some redundancy in assertion that could be removed.
> UT failure in branch-2.10:
>  {noformat}
> testContainerUpdateExecTypeGuaranteedToOpportunistic:
>   message='expected:OPPORTUNISTIC but 
> 

[jira] [Updated] (YARN-10255) revisit fix to intermittent TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic

2020-04-30 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-10255:
-
Attachment: YARN-10255-branch-2.10.001.patch

> revisit fix to intermittent 
> TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic
> --
>
> Key: YARN-10255
> URL: https://issues.apache.org/jira/browse/YARN-10255
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: YARN-10255-branch-2.10.001.patch, YARN-10255.001.patch
>
>
> Creating this Jira to fix intermittent failure in branch-2.10. Also, the fix 
> in YARN-7372 has some redundancy in assertion that could be removed.
> UT failure in branch-2.10:
>  {noformat}
> testContainerUpdateExecTypeGuaranteedToOpportunistic:
>   message='expected:OPPORTUNISTIC but 
> 

[jira] [Updated] (YARN-10255) revisit fix to intermittent TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic

2020-04-30 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-10255:
-
Attachment: YARN-10255.001.patch

> revisit fix to intermittent 
> TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic
> --
>
> Key: YARN-10255
> URL: https://issues.apache.org/jira/browse/YARN-10255
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: YARN-10255.001.patch
>
>
> Creating this Jira to fix intermittent failure in branch-2.10. Also, the fix 
> in YARN-7372 has some redundancy in assertion that could be removed.
> UT failure in branch-2.10:
>  {noformat}
> testContainerUpdateExecTypeGuaranteedToOpportunistic:
>   message='expected:OPPORTUNISTIC but 
> 

[jira] [Created] (YARN-10255) revisit fix to intermittent TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic

2020-04-30 Thread Ahmed Hussein (Jira)
Ahmed Hussein created YARN-10255:


 Summary: revisit fix to intermittent 
TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic
 Key: YARN-10255
 URL: https://issues.apache.org/jira/browse/YARN-10255
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ahmed Hussein
Assignee: Ahmed Hussein


Creating this Jira to fix intermittent failure in branch-2.10. Also, the fix in 
YARN-7372 has some redundancy in assertion that could be removed.

UT failure in branch-2.10:

 {noformat}
testContainerUpdateExecTypeGuaranteedToOpportunistic:
 

[jira] [Updated] (YARN-10220) RM HA times out intermittently

2020-04-09 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-10220:
-
Affects Version/s: 3.3.0
   2.10.0
   3.2.1
   3.1.3

> RM HA times out intermittently
> --
>
> Key: YARN-10220
> URL: https://issues.apache.org/jira/browse/YARN-10220
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.10.0, 3.3.0, 3.2.1, 3.1.3
>Reporter: Ahmed Hussein
>Assignee: Bilwa S T
>Priority: Major
>
> TestResourceTrackerOnHA, among other tests, times out intermittently:
> * TestApplicationClientProtocolOnHA
> * TestApplicationMasterServiceProtocolForTimelineV2
> * TestApplicationMasterServiceProtocolOnHA
> {code:bash}
> [INFO] --- maven-surefire-plugin:3.0.0-M1:test (default-test) @ 
> hadoop-yarn-client ---
> [INFO]
> [INFO] ---
> [INFO]  T E S T S
> [INFO] ---
> [INFO] Running org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 19.612 s <<< FAILURE! - in 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
> [ERROR] 
> testResourceTrackerOnHA(org.apache.hadoop.yarn.client.TestResourceTrackerOnHA)
>   Time elapsed: 19.473 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 15000 
> milliseconds
>   at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
>   at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
>   at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
>   at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>   at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:336)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:203)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:699)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:812)
>   at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1452)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1405)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>   at com.sun.proxy.$Proxy93.registerNodeManager(Unknown Source)
>   at 
> org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:73)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy94.registerNodeManager(Unknown Source)
>   at 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA.testResourceTrackerOnHA(TestResourceTrackerOnHA.java:64)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> 

[jira] [Commented] (YARN-2710) RM HA tests failed intermittently on trunk

2020-04-01 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073170#comment-17073170
 ] 

Ahmed Hussein commented on YARN-2710:
-

[~ebadger] I know you did a tremendous job, and I am very appreciative of what 
you have done, are still doing, and will do.
I created a new Jira, YARN-10220, and uploaded a patch for branch-3.2.

> RM HA tests failed intermittently on trunk
> --
>
> Key: YARN-2710
> URL: https://issues.apache.org/jira/browse/YARN-2710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
> Environment: Java 8, jenkins
>Reporter: Wangda Tan
>Assignee: Ahmed Hussein
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: TestResourceTrackerOnHA-output.2.txt, 
> YARN-2710-branch-2.10.001.patch, YARN-2710-branch-2.10.002.patch, 
> YARN-2710-branch-2.10.003.patch, YARN-2710-branch-3.2.003.patch, 
> YARN-2710.001.patch, YARN-2710.002.patch, YARN-2710.003.patch, 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt
>
>
> Failures like the following can happen in TestApplicationClientProtocolOnHA, 
> TestResourceTrackerOnHA, etc.
> {code}
> org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA
> testGetApplicationAttemptsOnHA(org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA)
>   Time elapsed: 9.491 sec  <<< ERROR!
> java.net.ConnectException: Call From asf905.gq1.ygridcore.net/67.195.81.149 
> to asf905.gq1.ygridcore.net:28032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
>   at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1438)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>   at com.sun.proxy.$Proxy17.getApplicationAttempts(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationAttempts(ApplicationClientProtocolPBClientImpl.java:372)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
>   at com.sun.proxy.$Proxy18.getApplicationAttempts(Unknown Source)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationAttempts(YarnClientImpl.java:583)
>   at 
> org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA.testGetApplicationAttemptsOnHA(TestApplicationClientProtocolOnHA.java:137)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2710) RM HA tests failed intermittently on trunk

2020-04-01 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-2710:

Attachment: YARN-2710-branch-3.2.003.patch

> RM HA tests failed intermittently on trunk
> --
>
> Key: YARN-2710
> URL: https://issues.apache.org/jira/browse/YARN-2710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
> Environment: Java 8, jenkins
>Reporter: Wangda Tan
>Assignee: Ahmed Hussein
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: TestResourceTrackerOnHA-output.2.txt, 
> YARN-2710-branch-2.10.001.patch, YARN-2710-branch-2.10.002.patch, 
> YARN-2710-branch-2.10.003.patch, YARN-2710-branch-3.2.003.patch, 
> YARN-2710.001.patch, YARN-2710.002.patch, YARN-2710.003.patch, 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt






[jira] [Created] (YARN-10220) RM HA times out intermittently

2020-04-01 Thread Ahmed Hussein (Jira)
Ahmed Hussein created YARN-10220:


 Summary: RM HA times out intermittently
 Key: YARN-10220
 URL: https://issues.apache.org/jira/browse/YARN-10220
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ahmed Hussein


TestResourceTrackerOnHA, among other tests, times out intermittently:
* TestApplicationClientProtocolOnHA
* TestApplicationMasterServiceProtocolForTimelineV2
* TestApplicationMasterServiceProtocolOnHA
{code:bash}
[INFO] --- maven-surefire-plugin:3.0.0-M1:test (default-test) @ 
hadoop-yarn-client ---
[INFO]
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 19.612 
s <<< FAILURE! - in org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
[ERROR] 
testResourceTrackerOnHA(org.apache.hadoop.yarn.client.TestResourceTrackerOnHA)  
Time elapsed: 19.473 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 15000 
milliseconds
at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at 
org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:336)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:203)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:699)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:812)
at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636)
at org.apache.hadoop.ipc.Client.call(Client.java:1452)
at org.apache.hadoop.ipc.Client.call(Client.java:1405)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
at com.sun.proxy.$Proxy93.registerNodeManager(Unknown Source)
at 
org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:73)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy94.registerNodeManager(Unknown Source)
at 
org.apache.hadoop.yarn.client.TestResourceTrackerOnHA.testResourceTrackerOnHA(TestResourceTrackerOnHA.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:80)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)

[INFO]
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR]   TestResourceTrackerOnHA.testResourceTrackerOnHA:64 » 

[jira] [Commented] (YARN-2710) RM HA tests failed intermittently on trunk

2020-03-23 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17065222#comment-17065222
 ] 

Ahmed Hussein commented on YARN-2710:
-

On branch-3.2, the behavior differs from the issue description: the test 
case hangs while calling {{ResourceTrackerPBClientImpl.registerNodeManager()}}. 
Should this be a separate Jira?

 
{code:bash}
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 19.126 
s <<< FAILURE! - in org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
[ERROR] 
testResourceTrackerOnHA(org.apache.hadoop.yarn.client.TestResourceTrackerOnHA)  
Time elapsed: 18.96 s  <<< ERROR!
java.lang.Exception: test timed out after 15000 milliseconds
at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at 
org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:336)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:203)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:699)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:812)
at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636)
at org.apache.hadoop.ipc.Client.call(Client.java:1452)
at org.apache.hadoop.ipc.Client.call(Client.java:1405)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
at com.sun.proxy.$Proxy92.registerNodeManager(Unknown Source)
at 
org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:73)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy93.registerNodeManager(Unknown Source)
at 
org.apache.hadoop.yarn.client.TestResourceTrackerOnHA.testResourceTrackerOnHA(TestResourceTrackerOnHA.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:55)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{code}
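
For context on the {{test timed out after 15000 milliseconds}} error above: the {{FailOnTimeout}} frames suggest the runner executes the test body on a separate thread and gives up once the deadline passes. Below is a minimal sketch of that idea using only {{java.util.concurrent}}. It is not JUnit's actual implementation; {{TimeoutSketch}} and {{timesOut}} are hypothetical names, and the sleeping task merely stands in for a blocked {{registerNodeManager()}} call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for illustration only; not JUnit or Hadoop code. */
final class TimeoutSketch {

  /**
   * Runs the body on a separate thread and reports whether it failed to
   * finish within timeoutMillis.
   */
  static boolean timesOut(Callable<Void> body, long timeoutMillis) {
    FutureTask<Void> task = new FutureTask<>(body);
    Thread worker = new Thread(task, "test-body");
    // Daemon thread: a body that never returns must not keep the JVM alive.
    worker.setDaemon(true);
    worker.start();
    try {
      task.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return false; // the body finished before the deadline
    } catch (TimeoutException e) {
      // A runner would report "test timed out after N milliseconds" here,
      // typically together with the stack of the still-blocked worker.
      task.cancel(true); // interrupts the worker thread
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("body failed", e.getCause());
    }
  }

  public static void main(String[] args) {
    // Stand-in for a call that blocks far longer than the deadline.
    boolean blocked = timesOut(() -> {
      Thread.sleep(5_000);
      return null;
    }, 200);
    // A body that returns immediately.
    boolean fast = timesOut(() -> null, 200);
    System.out.println("blocked body timed out: " + blocked);
    System.out.println("fast body timed out: " + fast);
  }
}
```

The point of the sketch is that the timeout only bounds how long the runner waits; it says nothing about why the body is blocked, which is why the stack of the worker thread is the useful part of the report.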
 

 

+*The log output is:*+
{code:bash}
2020-03-23 19:56:10,731 INFO  server.MiniYARNCluster 
(MiniYARNCluster.java:waitForNodeManagersToConnect(793)) - All Node Managers 
connected in MiniYARNCluster
2020-03-23 19:56:10,842 INFO  client.ConfiguredRMFailoverProxyProvider 
(ConfiguredRMFailoverProxyProvider.java:performFailover(100)) - Failing over to 
rm2
2020-03-23 19:56:10,847 INFO  resourcemanager.RMAuditLogger 
(RMAuditLogger.java:logSuccess(386)) - USER=ahussein  IP=10.0.0.110   
OPERATION=Get Applications Request  TARGET=ClientRMService  RESULT=SUCCESS
2020-03-23 19:56:10,880 INFO  zookeeper.JUnit4ZKTestRunner 
(JUnit4ZKTestRunner.java:evaluate(53)) - RUNNING TEST METHOD 
testResourceTrackerOnHA
2020-03-23 19:56:25,884 ERROR 

[jira] [Commented] (YARN-2710) RM HA tests failed intermittently on trunk

2020-03-17 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061037#comment-17061037
 ] 

Ahmed Hussein commented on YARN-2710:
-

Thanks [~ebadger] for committing the patch to trunk and branch-2.10.
{quote}[~ahussein], could you put up a patch for branch-3.2?
{quote}
branch-3.2 uses a different version of JUnit, whose {{Timeout}} constructor does 
not accept a {{TimeUnit}} parameter.

After I fixed the compilation error, the fix still did not work on branch-3.2. I 
will need to investigate that branch further.
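
As an illustration of the incompatibility, assuming the older JUnit on branch-3.2 only offers the {{Timeout(int millis)}} constructor while the newer one also offers {{Timeout(long, TimeUnit)}}: a timeout written with a {{TimeUnit}} can be converted to plain milliseconds before being handed to the older constructor. {{LegacyTimeouts}} and {{toLegacyMillis}} are hypothetical names used only for this sketch, which is not taken from the patch.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only; not part of any patch. */
final class LegacyTimeouts {

  /**
   * Converts a (value, unit) timeout into the int millisecond count that an
   * int-only Timeout constructor expects.
   */
  static int toLegacyMillis(long timeout, TimeUnit unit) {
    long millis = unit.toMillis(timeout);
    // Fail loudly instead of silently truncating very large timeouts.
    return Math.toIntExact(millis);
  }

  public static void main(String[] args) {
    int millis = toLegacyMillis(15, TimeUnit.SECONDS);
    // With JUnit on the classpath (assumption: older release), the rule
    // would then be declared as:
    //   @Rule public Timeout globalTimeout = new Timeout(millis);
    // whereas a newer release would accept:
    //   @Rule public Timeout globalTimeout = new Timeout(15, TimeUnit.SECONDS);
    System.out.println("legacy timeout in ms: " + millis);
  }
}
```

Keeping the conversion in one place makes it easier to carry the same timeout values across branches that build against different JUnit versions.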

> RM HA tests failed intermittently on trunk
> --
>
> Key: YARN-2710
> URL: https://issues.apache.org/jira/browse/YARN-2710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
> Environment: Java 8, jenkins
>Reporter: Wangda Tan
>Assignee: Ahmed Hussein
>Priority: Major
> Fix For: 2.10.1, 3.4.0
>
> Attachments: TestResourceTrackerOnHA-output.2.txt, 
> YARN-2710-branch-2.10.001.patch, YARN-2710-branch-2.10.002.patch, 
> YARN-2710-branch-2.10.003.patch, YARN-2710.001.patch, YARN-2710.002.patch, 
> YARN-2710.003.patch, 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt






[jira] [Updated] (YARN-2710) RM HA tests failed intermittently on trunk

2020-03-16 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-2710:

Attachment: YARN-2710-branch-2.10.003.patch

> RM HA tests failed intermittently on trunk
> --
>
> Key: YARN-2710
> URL: https://issues.apache.org/jira/browse/YARN-2710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
> Environment: Java 8, jenkins
>Reporter: Wangda Tan
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: TestResourceTrackerOnHA-output.2.txt, 
> YARN-2710-branch-2.10.001.patch, YARN-2710-branch-2.10.002.patch, 
> YARN-2710-branch-2.10.003.patch, YARN-2710.001.patch, YARN-2710.002.patch, 
> YARN-2710.003.patch, 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt






[jira] [Updated] (YARN-2710) RM HA tests failed intermittently on trunk

2020-03-16 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-2710:

Attachment: YARN-2710.003.patch

> RM HA tests failed intermittently on trunk
> --
>
> Key: YARN-2710
> URL: https://issues.apache.org/jira/browse/YARN-2710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
> Environment: Java 8, jenkins
>Reporter: Wangda Tan
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: TestResourceTrackerOnHA-output.2.txt, 
> YARN-2710-branch-2.10.001.patch, YARN-2710-branch-2.10.002.patch, 
> YARN-2710.001.patch, YARN-2710.002.patch, YARN-2710.003.patch, 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt






[jira] [Commented] (YARN-9427) TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers fails sporadically

2020-03-13 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17058947#comment-17058947
 ] 

Ahmed Hussein commented on YARN-9427:
-

Thank you [~epayne]. You saved the patch before COVID-19!

> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers 
> fails sporadically
> 
>
> Key: YARN-9427
> URL: https://issues.apache.org/jira/browse/YARN-9427
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler, test
>Affects Versions: 2.10.0, 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Ahmed Hussein
>Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.1.4, 2.10.1
>
> Attachments: 
> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers, 
> YARN-9427-branch-2.10.001.patch, YARN-9427-branch-2.10.002.patch, 
> YARN-9427.001.patch, YARN-9427.002.patch
>
>
> Failed
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers
> {code}
> java.lang.AssertionError: expected:<2> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers(TestContainerSchedulerQueuing.java:1027)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}






[jira] [Commented] (YARN-2710) RM HA tests failed intermittently on trunk

2020-03-04 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17051539#comment-17051539
 ] 

Ahmed Hussein commented on YARN-2710:
-

Thanks [~Jim_Brennan] for the review. I uploaded two new patches with the new 
timeout rules.

> RM HA tests failed intermittently on trunk
> --
>
> Key: YARN-2710
> URL: https://issues.apache.org/jira/browse/YARN-2710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
> Environment: Java 8, jenkins
>Reporter: Wangda Tan
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: TestResourceTrackerOnHA-output.2.txt, 
> YARN-2710-branch-2.10.001.patch, YARN-2710-branch-2.10.002.patch, 
> YARN-2710.001.patch, YARN-2710.002.patch, 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2710) RM HA tests failed intermittently on trunk

2020-03-04 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-2710:

Attachment: YARN-2710.002.patch

> RM HA tests failed intermittently on trunk
> --
>
> Key: YARN-2710
> URL: https://issues.apache.org/jira/browse/YARN-2710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
> Environment: Java 8, jenkins
>Reporter: Wangda Tan
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: TestResourceTrackerOnHA-output.2.txt, 
> YARN-2710-branch-2.10.001.patch, YARN-2710-branch-2.10.002.patch, 
> YARN-2710.001.patch, YARN-2710.002.patch, 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt
>
>
> Failures like the following can happen in TestApplicationClientProtocolOnHA, 
> TestResourceTrackerOnHA, etc.
> {code}
> org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA
> testGetApplicationAttemptsOnHA(org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA)
>   Time elapsed: 9.491 sec  <<< ERROR!
> java.net.ConnectException: Call From asf905.gq1.ygridcore.net/67.195.81.149 
> to asf905.gq1.ygridcore.net:28032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
>   at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1438)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>   at com.sun.proxy.$Proxy17.getApplicationAttempts(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationAttempts(ApplicationClientProtocolPBClientImpl.java:372)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
>   at com.sun.proxy.$Proxy18.getApplicationAttempts(Unknown Source)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationAttempts(YarnClientImpl.java:583)
>   at 
> org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA.testGetApplicationAttemptsOnHA(TestApplicationClientProtocolOnHA.java:137)
> {code}






[jira] [Updated] (YARN-2710) RM HA tests failed intermittently on trunk

2020-03-04 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-2710:

Attachment: YARN-2710-branch-2.10.002.patch

> RM HA tests failed intermittently on trunk
> --
>
> Key: YARN-2710
> URL: https://issues.apache.org/jira/browse/YARN-2710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
> Environment: Java 8, jenkins
>Reporter: Wangda Tan
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: TestResourceTrackerOnHA-output.2.txt, 
> YARN-2710-branch-2.10.001.patch, YARN-2710-branch-2.10.002.patch, 
> YARN-2710.001.patch, 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt
>
>
> Failures like the following can happen in TestApplicationClientProtocolOnHA, 
> TestResourceTrackerOnHA, etc.
> {code}
> org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA
> testGetApplicationAttemptsOnHA(org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA)
>   Time elapsed: 9.491 sec  <<< ERROR!
> java.net.ConnectException: Call From asf905.gq1.ygridcore.net/67.195.81.149 
> to asf905.gq1.ygridcore.net:28032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
>   at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1438)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>   at com.sun.proxy.$Proxy17.getApplicationAttempts(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationAttempts(ApplicationClientProtocolPBClientImpl.java:372)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
>   at com.sun.proxy.$Proxy18.getApplicationAttempts(Unknown Source)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationAttempts(YarnClientImpl.java:583)
>   at 
> org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA.testGetApplicationAttemptsOnHA(TestApplicationClientProtocolOnHA.java:137)
> {code}






[jira] [Commented] (YARN-2710) RM HA tests failed intermittently on trunk

2020-03-04 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17051455#comment-17051455
 ] 

Ahmed Hussein commented on YARN-2710:
-

Oh thanks [~Jim_Brennan]! My bad, I believe I was confused thinking that the 
values were in seconds. I will readjust the global timeout accordingly in 
another patch.

> RM HA tests failed intermittently on trunk
> --
>
> Key: YARN-2710
> URL: https://issues.apache.org/jira/browse/YARN-2710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
> Environment: Java 8, jenkins
>Reporter: Wangda Tan
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: TestResourceTrackerOnHA-output.2.txt, 
> YARN-2710-branch-2.10.001.patch, YARN-2710.001.patch, 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt
>
>
> Failures like the following can happen in TestApplicationClientProtocolOnHA, 
> TestResourceTrackerOnHA, etc.
> {code}
> org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA
> testGetApplicationAttemptsOnHA(org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA)
>   Time elapsed: 9.491 sec  <<< ERROR!
> java.net.ConnectException: Call From asf905.gq1.ygridcore.net/67.195.81.149 
> to asf905.gq1.ygridcore.net:28032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
>   at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1438)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>   at com.sun.proxy.$Proxy17.getApplicationAttempts(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationAttempts(ApplicationClientProtocolPBClientImpl.java:372)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
>   at com.sun.proxy.$Proxy18.getApplicationAttempts(Unknown Source)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationAttempts(YarnClientImpl.java:583)
>   at 
> org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA.testGetApplicationAttemptsOnHA(TestApplicationClientProtocolOnHA.java:137)
> {code}






[jira] [Commented] (YARN-2710) RM HA tests failed intermittently on trunk

2020-03-04 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17051434#comment-17051434
 ] 

Ahmed Hussein commented on YARN-2710:
-

Thanks [~Jim_Brennan].
The original timeout of 15 seconds seemed too small to accommodate {{maximum 
retry count * retry delay}}. So, I increased the timeout of each test case, 
since the main blocker is the connection to the RM when bootstrapping is slow.
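
To make the arithmetic above concrete, here is a minimal sketch of the check that a per-test timeout leaves room for {{maximum retry count * retry delay}}. The class name, method names, and the retry values are hypothetical and chosen only for illustration; they are not taken from the patch or from the Hadoop configuration. Only the 15-second timeout comes from the comment above.

```java
import java.time.Duration;

/** Hypothetical helper: the names and retry values are illustrative assumptions. */
final class RetryBudget {
  /** Worst-case time spent waiting between retries: maxRetries * retryDelay. */
  static Duration worstCaseRetryTime(int maxRetries, Duration retryDelay) {
    return retryDelay.multipliedBy(maxRetries);
  }

  /** True if the test timeout is at least as long as the worst-case retry time. */
  static boolean timeoutCoversRetries(Duration testTimeout, int maxRetries,
      Duration retryDelay) {
    return testTimeout.compareTo(worstCaseRetryTime(maxRetries, retryDelay)) >= 0;
  }

  public static void main(String[] args) {
    Duration timeout = Duration.ofSeconds(15); // original per-test timeout
    int maxRetries = 30;                       // assumed value, illustration only
    Duration delay = Duration.ofSeconds(1);    // assumed value, illustration only
    // With these assumed values the retries need 30 s, so 15 s is too short.
    System.out.println(timeoutCoversRetries(timeout, maxRetries, delay));
  }
}
```

With a check like this, a test timeout that is shorter than the retry budget can be spotted before it shows up as an intermittent failure.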

> RM HA tests failed intermittently on trunk
> --
>
> Key: YARN-2710
> URL: https://issues.apache.org/jira/browse/YARN-2710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
> Environment: Java 8, jenkins
>Reporter: Wangda Tan
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: TestResourceTrackerOnHA-output.2.txt, 
> YARN-2710-branch-2.10.001.patch, YARN-2710.001.patch, 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt
>
>
> Failure like, it can be happened in TestApplicationClientProtocolOnHA, 
> TestResourceTrackerOnHA, etc.
> {code}
> org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA
> testGetApplicationAttemptsOnHA(org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA)
>   Time elapsed: 9.491 sec  <<< ERROR!
> java.net.ConnectException: Call From asf905.gq1.ygridcore.net/67.195.81.149 
> to asf905.gq1.ygridcore.net:28032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
>   at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1438)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>   at com.sun.proxy.$Proxy17.getApplicationAttempts(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationAttempts(ApplicationClientProtocolPBClientImpl.java:372)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
>   at com.sun.proxy.$Proxy18.getApplicationAttempts(Unknown Source)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationAttempts(YarnClientImpl.java:583)
>   at 
> org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA.testGetApplicationAttemptsOnHA(TestApplicationClientProtocolOnHA.java:137)
> {code}






[jira] [Commented] (YARN-9427) TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers fails sporadically

2020-03-02 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049512#comment-17049512
 ] 

Ahmed Hussein commented on YARN-9427:
-

[~prabhujoseph], [~abmodi] Can you please take a look at the patches for 
branch-2.10 and trunk?
It is a small change in the JUnit test.

> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers 
> fails sporadically
> 
>
> Key: YARN-9427
> URL: https://issues.apache.org/jira/browse/YARN-9427
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler, test
>Affects Versions: 2.10.0, 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: 
> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers, 
> YARN-9427-branch-2.10.001.patch, YARN-9427-branch-2.10.002.patch, 
> YARN-9427.001.patch, YARN-9427.002.patch
>
>
> Failed
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers
> {code}
> java.lang.AssertionError: expected:<2> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers(TestContainerSchedulerQueuing.java:1027)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}






[jira] [Updated] (YARN-9427) TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers fails sporadically

2020-03-02 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-9427:

Affects Version/s: 2.10.0

> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers 
> fails sporadically
> 
>
> Key: YARN-9427
> URL: https://issues.apache.org/jira/browse/YARN-9427
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler, test
>Affects Versions: 2.10.0, 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: 
> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers, 
> YARN-9427-branch-2.10.001.patch, YARN-9427-branch-2.10.002.patch, 
> YARN-9427.001.patch, YARN-9427.002.patch
>
>
> Failed
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers
> {code}
> java.lang.AssertionError: expected:<2> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers(TestContainerSchedulerQueuing.java:1027)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}






[jira] [Commented] (YARN-9427) TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers fails sporadically

2020-03-02 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049330#comment-17049330
 ] 

Ahmed Hussein commented on YARN-9427:
-

This failure is hard to reproduce. My intuition is that it only happens when 
the unit test fetches the status of the containers before they are done.
My fix is to wait until all the containers are done, then go forward with the 
assertion.

I also noticed that there are many possible improvements to be made in 
{{BaseContainerManagerTest}} and its child classes that can be addressed in a 
separate Jira, such as:
# Refactor the test cases to replace the long idle waits (e.g., 
{{Sleep(5000)}}). In some cases it is not clear what the purpose of the wait 
is, or whether it is long enough.
# Replace code blocks that poll on a certain condition with 
{{GenericTestUtils.waitFor()}}.
# Reduce the wait time between polls. Currently, there is at least 1 second 
between checks.
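
The "wait for a condition, then assert" pattern described above can be sketched as follows. This is a standalone, simplified stand-in written for illustration; it is not the Hadoop {{GenericTestUtils.waitFor()}} implementation, and the class name {{PollingWait}} and the timing values are assumptions.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for a polling wait helper used in tests. */
final class PollingWait {
  /**
   * Polls the condition every checkEveryMillis until it holds, and throws
   * TimeoutException if it still does not hold after waitForMillis.
   */
  static void waitFor(BooleanSupplier check, long checkEveryMillis,
      long waitForMillis) throws TimeoutException, InterruptedException {
    long deadline = System.nanoTime() + waitForMillis * 1_000_000L;
    while (!check.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException(
            "Condition not met within " + waitForMillis + " ms");
      }
      Thread.sleep(checkEveryMillis);
    }
  }

  public static void main(String[] args) throws Exception {
    long start = System.currentTimeMillis();
    // Stand-in for "all containers are done": becomes true after about 200 ms.
    waitFor(() -> System.currentTimeMillis() - start >= 200, 20, 5_000);
    System.out.println("condition met; safe to run the assertions");
  }
}
```

Compared with a fixed sleep, a short poll interval returns as soon as the condition holds, and the explicit deadline turns a hang into a clear timeout failure.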

> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers 
> fails sporadically
> 
>
> Key: YARN-9427
> URL: https://issues.apache.org/jira/browse/YARN-9427
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler, test
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: 
> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers, 
> YARN-9427-branch-2.10.001.patch, YARN-9427-branch-2.10.002.patch, 
> YARN-9427.001.patch, YARN-9427.002.patch
>
>
> Failed
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers
> {code}
> java.lang.AssertionError: expected:<2> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers(TestContainerSchedulerQueuing.java:1027)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}




[jira] [Updated] (YARN-9427) TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers fails sporadically

2020-03-02 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-9427:

Attachment: YARN-9427.002.patch

> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers 
> fails sporadically
> 
>
> Key: YARN-9427
> URL: https://issues.apache.org/jira/browse/YARN-9427
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler, test
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: 
> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers, 
> YARN-9427-branch-2.10.001.patch, YARN-9427-branch-2.10.002.patch, 
> YARN-9427.001.patch, YARN-9427.002.patch
>
>
> Failed
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers
> {code}
> java.lang.AssertionError: expected:<2> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers(TestContainerSchedulerQueuing.java:1027)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9427) TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers fails sporadically

2020-03-02 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-9427:

Attachment: YARN-9427-branch-2.10.002.patch

> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers 
> fails sporadically
> 
>
> Key: YARN-9427
> URL: https://issues.apache.org/jira/browse/YARN-9427
> Project: Hadoop YARN
> Issue Type: Bug
> Components: scheduler, test
> Affects Versions: 3.2.0
> Reporter: Prabhu Joseph
> Assignee: Ahmed Hussein
> Priority: Major
> Attachments: 
> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers, 
> YARN-9427-branch-2.10.001.patch, YARN-9427-branch-2.10.002.patch, 
> YARN-9427.001.patch
>
>
> Failed
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers






[jira] [Updated] (YARN-9427) TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers fails sporadically

2020-02-28 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-9427:

Attachment: YARN-9427-branch-2.10.001.patch

> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers 
> fails sporadically
> 
>
> Key: YARN-9427
> URL: https://issues.apache.org/jira/browse/YARN-9427
> Project: Hadoop YARN
> Issue Type: Bug
> Components: scheduler, test
> Affects Versions: 3.2.0
> Reporter: Prabhu Joseph
> Assignee: Ahmed Hussein
> Priority: Major
> Attachments: 
> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers, 
> YARN-9427-branch-2.10.001.patch, YARN-9427.001.patch
>
>
> Failed
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers






[jira] [Updated] (YARN-9427) TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers fails sporadically

2020-02-28 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein updated YARN-9427:

Attachment: YARN-9427.001.patch

> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers 
> fails sporadically
> 
>
> Key: YARN-9427
> URL: https://issues.apache.org/jira/browse/YARN-9427
> Project: Hadoop YARN
> Issue Type: Bug
> Components: scheduler, test
> Affects Versions: 3.2.0
> Reporter: Prabhu Joseph
> Assignee: Ahmed Hussein
> Priority: Major
> Attachments: 
> TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers, 
> YARN-9427.001.patch
>
>
> Failed
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers






[jira] [Comment Edited] (YARN-10176) TestTimelineAuthFilterForV2 fails intermittently

2020-02-28 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047742#comment-17047742
 ] 

Ahmed Hussein edited comment on YARN-10176 at 2/28/20 3:38 PM:
---

The fix in YARN-9452 does not fix all the problems with 
{{TestTimelineAuthFilterForV2}}.


was (Author: ahussein):
The fix in YARN-9452 does not fix all the problems with \{{ 
TestTimelineAuthFilterForV2}}.

> TestTimelineAuthFilterForV2 fails intermittently
> 
>
> Key: YARN-10176
> URL: https://issues.apache.org/jira/browse/YARN-10176
> Project: Hadoop YARN
> Issue Type: Bug
> Components: timelineservice
> Reporter: Ahmed Hussein
> Assignee: Prabhu Joseph
> Priority: Major
>
> TestTimelineAuthFilterForV2 fails intermittently on trunk and branch-2.10.
> To reproduce the failure, execute TestTimelineAuthFilterForV2 inside a loop.
> {code:bash}
> [INFO] Running org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2
> [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 18.148 s <<< FAILURE! - in org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2
> [ERROR] testPutTimelineEntities[1](org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2)  Time elapsed: 6.852 s  <<< FAILURE!
> java.lang.AssertionError: Entities should have been published successfully.
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.testPutTimelineEntities(TestTimelineAuthFilterForV2.java:416)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> 
