[jira] [Updated] (YARN-3634) TestMRTimelineEventHandling and TestApplication are broken

2017-10-21 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3634:
---
Fix Version/s: 2.9.0

> TestMRTimelineEventHandling and TestApplication are broken
> --
>
> Key: YARN-3634
> URL: https://issues.apache.org/jira/browse/YARN-3634
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Sangjin Lee
> Assignee: Sangjin Lee
> Fix For: 2.9.0, 3.0.0-alpha1
>
> Attachments: YARN-3634-YARN-2928.001.patch, 
> YARN-3634-YARN-2928.002.patch, YARN-3634-YARN-2928.003.patch, 
> YARN-3634-YARN-2928.004.patch
>
>
> TestMRTimelineEventHandling is broken. Relevant error message:
> {noformat}
> 2015-05-12 06:28:56,415 INFO  [AsyncDispatcher event handler] ipc.Client 
> (Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
> asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 0 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2015-05-12 06:28:57,416 INFO  [AsyncDispatcher event handler] ipc.Client 
> (Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
> asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 1 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2015-05-12 06:28:58,416 INFO  [AsyncDispatcher event handler] ipc.Client 
> (Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
> asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 2 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2015-05-12 06:28:59,417 INFO  [AsyncDispatcher event handler] ipc.Client 
> (Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
> asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 3 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2015-05-12 06:29:00,418 INFO  [AsyncDispatcher event handler] ipc.Client 
> (Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
> asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 4 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2015-05-12 06:29:01,419 INFO  [AsyncDispatcher event handler] ipc.Client 
> (Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
> asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 5 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2015-05-12 06:29:02,420 INFO  [AsyncDispatcher event handler] ipc.Client 
> (Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
> asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 6 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2015-05-12 06:29:03,420 INFO  [AsyncDispatcher event handler] ipc.Client 
> (Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
> asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 7 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2015-05-12 06:29:04,421 INFO  [AsyncDispatcher event handler] ipc.Client 
> (Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
> asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 8 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2015-05-12 06:29:05,422 INFO  [AsyncDispatcher event handler] ipc.Client 
> (Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
> asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 9 time(s); retry 
> policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2015-05-12 06:29:05,424 ERROR [AsyncDispatcher event handler] 
> collector.NodeTimelineCollectorManager 
> (NodeTimelineCollectorManager.java:postPut(121)) - Failed to communicate with 
> NM Collector Service for application_1431412130291_0001
> 2015-05-12 06:29:05,425 WARN  [AsyncDispatcher event handler] 
> containermanager.AuxServices 
> (AuxServices.java:logWarningWhenAuxServiceThrowExceptions(261)) - The 
> auxService name is timeline_collector and it got an error at event: 
> CONTAINER_INIT
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.ConnectException: Call From asf904.gq1.ygridcore.net/67.195.81.148 
> to asf904.gq1.ygridcore.net:0 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see: 

[jira] [Updated] (YARN-3634) TestMRTimelineEventHandling and TestApplication are broken

2015-05-13 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-3634:
--
Attachment: YARN-3634-YARN-2928.004.patch

Patch v.4 posted.

Yes, you're right. I missed that second check.

 TestMRTimelineEventHandling and TestApplication are broken
 --

 Key: YARN-3634
 URL: https://issues.apache.org/jira/browse/YARN-3634
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: YARN-3634-YARN-2928.001.patch, 
 YARN-3634-YARN-2928.002.patch, YARN-3634-YARN-2928.003.patch, 
 YARN-3634-YARN-2928.004.patch



[jira] [Updated] (YARN-3634) TestMRTimelineEventHandling and TestApplication are broken

2015-05-12 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-3634:
--
Attachment: YARN-3634-YARN-2928.002.patch

Patch v.2 posted.

- fixed the findbugs issue
- fixed the TestApplication tests (existing failure)

 TestMRTimelineEventHandling and TestApplication are broken
 --

 Key: YARN-3634
 URL: https://issues.apache.org/jira/browse/YARN-3634
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: YARN-3634-YARN-2928.001.patch, 
 YARN-3634-YARN-2928.002.patch



[jira] [Updated] (YARN-3634) TestMRTimelineEventHandling and TestApplication are broken

2015-05-12 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-3634:
--
Description: 
TestMRTimelineEventHandling is broken. Relevant error message:

{noformat}
2015-05-12 06:28:56,415 INFO  [AsyncDispatcher event handler] ipc.Client 
(Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 0 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2015-05-12 06:28:57,416 INFO  [AsyncDispatcher event handler] ipc.Client 
(Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 1 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2015-05-12 06:28:58,416 INFO  [AsyncDispatcher event handler] ipc.Client 
(Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 2 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2015-05-12 06:28:59,417 INFO  [AsyncDispatcher event handler] ipc.Client 
(Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 3 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2015-05-12 06:29:00,418 INFO  [AsyncDispatcher event handler] ipc.Client 
(Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 4 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2015-05-12 06:29:01,419 INFO  [AsyncDispatcher event handler] ipc.Client 
(Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 5 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2015-05-12 06:29:02,420 INFO  [AsyncDispatcher event handler] ipc.Client 
(Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 6 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2015-05-12 06:29:03,420 INFO  [AsyncDispatcher event handler] ipc.Client 
(Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 7 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2015-05-12 06:29:04,421 INFO  [AsyncDispatcher event handler] ipc.Client 
(Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 8 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2015-05-12 06:29:05,422 INFO  [AsyncDispatcher event handler] ipc.Client 
(Client.java:handleConnectionFailure(882)) - Retrying connect to server: 
asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 9 time(s); retry policy 
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
MILLISECONDS)
2015-05-12 06:29:05,424 ERROR [AsyncDispatcher event handler] 
collector.NodeTimelineCollectorManager 
(NodeTimelineCollectorManager.java:postPut(121)) - Failed to communicate with 
NM Collector Service for application_1431412130291_0001
2015-05-12 06:29:05,425 WARN  [AsyncDispatcher event handler] 
containermanager.AuxServices 
(AuxServices.java:logWarningWhenAuxServiceThrowExceptions(261)) - The 
auxService name is timeline_collector and it got an error at event: 
CONTAINER_INIT
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
java.net.ConnectException: Call From asf904.gq1.ygridcore.net/67.195.81.148 to 
asf904.gq1.ygridcore.net:0 failed on connection exception: 
java.net.ConnectException: Connection refused; For more details see:  
http://wiki.apache.org/hadoop/ConnectionRefused
at 
org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager.putIfAbsent(TimelineCollectorManager.java:97)
at 
org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService.addApplication(PerNodeTimelineCollectorsAuxService.java:99)
at 
org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService.initializeContainer(PerNodeTimelineCollectorsAuxService.java:126)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.handle(AuxServices.java:226)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.handle(AuxServices.java:49)
at 
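
The log above shows ten "Retrying connect to server" lines about one second apart, followed by the failure 2 ms after the last one. A minimal Java sketch of a fixed-sleep retry loop with that shape follows. It is an illustration only and is not Hadoop's ipc.Client code: the class FixedSleepRetrySketch, the method callWithRetries, and the attempts counter are hypothetical names, and the sleep-then-log ordering is an assumption inferred from the timestamps.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not part of Hadoop.
class FixedSleepRetrySketch {
    /** Number of times the action was invoked by the most recent call. */
    public static int attempts = 0;

    /**
     * Runs the action once, then retries up to maxRetries times with a
     * fixed sleep between tries. Rethrows the last failure when the
     * retries are exhausted.
     */
    public static <T> T callWithRetries(Callable<T> action, int maxRetries,
                                        long sleepTime, TimeUnit unit)
            throws Exception {
        attempts = 0;
        for (int tried = 0; ; tried++) {
            try {
                attempts++;
                return action.call();
            } catch (Exception e) {
                if (tried >= maxRetries) {
                    throw e; // retries exhausted: surface the last failure
                }
                // Assumed ordering: sleep first, then log, then retry.
                unit.sleep(sleepTime);
                System.out.println("Already tried " + tried
                        + " time(s); retrying");
            }
        }
    }

    public static void main(String[] args) {
        try {
            // Stand-in for a connect call that is always refused.
            callWithRetries(() -> {
                throw new ConnectException("Connection refused");
            }, 10, 1, TimeUnit.MILLISECONDS);
        } catch (Exception e) {
            System.out.println("Giving up after " + attempts
                    + " attempts: " + e.getMessage());
        }
    }
}
```

With maxRetries=10 the sketch makes one initial attempt plus ten retries, so an action that never succeeds is invoked 11 times before the exception reaches the caller. The 1 ms sleep in main only keeps the demo fast; the log above reports sleepTime=1000 MILLISECONDS.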

[jira] [Updated] (YARN-3634) TestMRTimelineEventHandling and TestApplication are broken

2015-05-12 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-3634:
--
Summary: TestMRTimelineEventHandling and TestApplication are broken  (was: 
TestMRTimelineEventHandling is broken due to timing issues)

 TestMRTimelineEventHandling and TestApplication are broken
 --

 Key: YARN-3634
 URL: https://issues.apache.org/jira/browse/YARN-3634
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: YARN-3634-YARN-2928.001.patch



[jira] [Updated] (YARN-3634) TestMRTimelineEventHandling and TestApplication are broken

2015-05-12 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-3634:
--
Attachment: YARN-3634-YARN-2928.003.patch

Patch v.3 posted.

- fixed whitespace

 TestMRTimelineEventHandling and TestApplication are broken
 --

 Key: YARN-3634
 URL: https://issues.apache.org/jira/browse/YARN-3634
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: YARN-3634-YARN-2928.001.patch, 
 YARN-3634-YARN-2928.002.patch, YARN-3634-YARN-2928.003.patch

