Hi Tushar,
Thanks for the reply. I am attaching the detailed log below FYR. I do not see the
"Shutdown after reaching failure threshold for" message in the log. One more
piece of information: a colleague who looked at my logs says the problem is
due to a security bug that prevents the application from connecting to HDFS
with a username and password. That fits with what you said, namely that some
operator is continuously failing in the DAG and the application is getting
killed because of it. Yes, I have an HDFS operator that writes our final
output to HDFS. It looks like it is failing, and that may be why the
application is getting killed.
This could be one of the reasons, because
*
2017-03-02 12:18:08,861 WARN ipc.Client
(Client.java:handleConnectionFailure(886)) - Failed to connect to server:
d-3zkvk02.target.com/10.66.241.46:8030: retries get failed due to exceeded
maximum allowed retries number: 0
java.net.ConnectException: Connection refused*
I am getting this warning message throughout the log.
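To make the repeating patterns easier to see, here is a rough Python sketch that tallies three things from a log like the one below: heartbeat timeouts, operators named in container restarts, and servers that refused connections. The regular expressions are assumptions based only on the message shapes visible in this log, not on any documented Apex log format, and the function name `summarize` is hypothetical.

```python
import re
from collections import Counter

# Patterns guessed from the messages in this log (assumption, not an official format).
TIMEOUT_RE = re.compile(r"heartbeat timeout \((\d+) ms\)")
AFFECTED_RE = re.compile(r"Affected operators \[PTOperator\[id=(\d+),name=([^\]]+)\]")
SERVER_RE = re.compile(r"Failed to connect to server: (\S+?): retries")


def summarize(raw: str) -> dict:
    """Summarize heartbeat timeouts, restarted operators and unreachable servers."""
    # The pasted log is hard-wrapped, so collapse all whitespace runs
    # into single spaces before matching across the original line breaks.
    text = " ".join(raw.split())
    return {
        "heartbeat_timeouts_ms": [int(ms) for ms in TIMEOUT_RE.findall(text)],
        "restarted_operators": Counter(name for _id, name in AFFECTED_RE.findall(text)),
        "unreachable_servers": Counter(SERVER_RE.findall(text)),
    }
```

Calling `summarize(open("am.log").read())`, where `am.log` is a hypothetical file holding the pasted log, returns a dictionary whose counters show which operator keeps being restarted and which server addresses keep refusing connections.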
Full log, FYR:
2017-03-02 21:38:39,094 INFO hdfs.DFSClient
(DFSClient.java:getDelegationToken(1043)) - Created HDFS_DELEGATION_TOKEN
token 5002831 for SVFFLHDS on ha-hdfs:littleredns
2017-03-02 21:38:39,095 WARN ipc.Client
(Client.java:handleConnectionFailure(886)) - Failed to connect to server:
d-3zkvk02.target.com/10.66.241.46:8032: retries get failed due to exceeded
maximum allowed retries number: 0
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:650)
at
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:745)
at org.apache.hadoop.ipc.Client$Connection.access$3200(Client.java:397)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1618)
at org.apache.hadoop.ipc.Client.call(Client.java:1449)
at org.apache.hadoop.ipc.Client.call(Client.java:1396)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at com.sun.proxy.$Proxy93.getDelegationToken(Unknown Source)
at
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getDelegationToken(ApplicationClientProtocolPBClientImpl.java:310)
at sun.reflect.GeneratedMethodAccessor150.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
at com.sun.proxy.$Proxy94.getDelegationToken(Unknown Source)
at
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getRMDelegationToken(YarnClientImpl.java:531)
at
com.datatorrent.stram.client.StramClientUtils$ClientRMHelper.addRMDelegationToken(StramClientUtils.java:281)
at
com.datatorrent.stram.security.StramUserLogin$1.run(StramUserLogin.java:114)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at
com.datatorrent.stram.security.StramUserLogin.refreshTokens(StramUserLogin.java:98)
at
com.datatorrent.stram.StreamingAppMasterService.execute(StreamingAppMasterService.java:751)
at
com.datatorrent.stram.StreamingAppMasterService.run(StreamingAppMasterService.java:647)
at
com.datatorrent.stram.StreamingAppMaster.main(StreamingAppMaster.java:104)
2017-03-02 21:38:39,095 INFO client.ConfiguredRMFailoverProxyProvider
(ConfiguredRMFailoverProxyProvider.java:performFailover(100)) - Failing over
to rm2
2017-03-02 21:38:39,107 INFO client.StramClientUtils$ClientRMHelper
(StramClientUtils.java:addRMDelegationToken(287)) - Yarn Resource Manager HA
is enabled
2017-03-02 21:38:39,107 INFO client.StramClientUtils$ClientRMHelper
(StramClientUtils.java:getRMHAToken(265)) - Yarn Resource Manager id: rm1
2017-03-02 21:38:39,107 INFO client.StramClientUtils$ClientRMHelper
(StramClientUtils.java:getRMHAToken(265)) - Yarn Resource Manager id: rm2
2017-03-02 21:38:39,107 INFO client.StramClientUtils$ClientRMHelper
(StramClientUtils.java:addRMDelegationToken(298)) - RM dt Kind:
RM_DELEGATION_TOKEN, Service: 10.66.241.46:8032,10.66.241.14:8032, Ident:
([email protected], renewer=yarn, realUser=,
issueDate=1488512319104, maxDate=1489117119104, sequenceNumber=1309582,
masterKeyId=8466)
2017-03-02 21:38:40,111 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:41,115 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:42,119 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:43,124 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:44,128 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:45,132 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:46,136 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:47,140 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:48,143 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:49,147 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:50,151 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:51,154 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:52,160 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:53,164 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:54,167 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:55,168 INFO stram.StreamingContainerManager
(StreamingContainerManager.java:monitorHeartbeat(789)) - Container
container_e3076_1487635160678_27652_01_001...@brdn1164.target.com:45454
heartbeat timeout (30795 ms).
2017-03-02 21:38:55,171 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:56,172 INFO stram.StreamingAppMasterService
(StreamingAppMasterService.java:sendContainerAskToRM(1119)) - Requested stop
container container_e3076_1487635160678_27652_01_001538
2017-03-02 21:38:56,172 INFO impl.NMClientAsyncImpl
(NMClientAsyncImpl.java:run(536)) - Processing Event EventType:
STOP_CONTAINER for Container container_e3076_1487635160678_27652_01_001538
2017-03-02 21:38:56,172 INFO stram.StreamingContainerManager
(StreamingContainerManager.java:monitorHeartbeat(789)) - Container
container_e3076_1487635160678_27652_01_001...@brdn1164.target.com:45454
heartbeat timeout (31799 ms).
2017-03-02 21:38:56,178 INFO impl.NMClientImpl
(NMClientImpl.java:stopContainer(242)) - ok, stopContainerInternal..
container_e3076_1487635160678_27652_01_001538
2017-03-02 21:38:56,178 INFO impl.ContainerManagementProtocolProxy
(ContainerManagementProtocolProxy.java:newProxy(260)) - Opening proxy :
brdn1164.target.com:45454
2017-03-02 21:38:56,183 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:57,184 INFO stram.StreamingAppMasterService
(StreamingAppMasterService.java:execute(929)) - Completed
containerId=container_e3076_1487635160678_27652_01_001538, state=COMPLETE,
exitStatus=-105, diagnostics=Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
2017-03-02 21:38:57,185 INFO
delegation.AbstractDelegationTokenSecretManager
(AbstractDelegationTokenSecretManager.java:cancelToken(520)) - Token
cancelation requested for identifier: owner=SVFFLHDS, renewer=, realUser=,
issueDate=1488512300792, maxDate=4611687506939688695, sequenceNumber=1508,
masterKeyId=2
2017-03-02 21:38:57,185 INFO stram.StreamingContainerManager
(StreamingContainerManager.java:scheduleContainerRestart(1122)) - Initiating
recovery for
container_e3076_1487635160678_27652_01_001...@brdn1164.target.com:45454
2017-03-02 21:38:57,185 INFO stram.StreamingContainerManager
(StreamingContainerManager.java:scheduleContainerRestart(1138)) - Affected
operators [PTOperator[id=13,name=Retriggered_Recs_Hdfs_Write]]
2017-03-02 21:38:57,187 ERROR stram.FSEventRecorder
(FSEventRecorder.java:run(85)) - Caught Exception
java.io.IOException: Connection is not open
at
com.datatorrent.stram.util.PubSubWebSocketClient.assertUsable(PubSubWebSocketClient.java:264)
at
com.datatorrent.stram.util.PubSubWebSocketClient.publish(PubSubWebSocketClient.java:287)
at
com.datatorrent.stram.util.SharedPubSubWebSocketClient.publish(SharedPubSubWebSocketClient.java:120)
at
com.datatorrent.stram.FSEventRecorder$EventRecorderThread.run(FSEventRecorder.java:79)
2017-03-02 21:38:57,203 ERROR stram.FSEventRecorder
(FSEventRecorder.java:run(85)) - Caught Exception
java.net.ConnectException: Connection refused: /10.66.18.168:9090
at
org.apache.apex.shaded.ning19.com.ning.http.client.providers.netty.request.NettyConnectListener.onFutureFailure(NettyConnectListener.java:133)
at
org.apache.apex.shaded.ning19.com.ning.http.client.providers.netty.request.NettyConnectListener.operationComplete(NettyConnectListener.java:145)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:409)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:400)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.DefaultChannelFuture.setFailure(DefaultChannelFuture.java:362)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:109)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
at
org.apache.apex.shaded.ning19.org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.apache.apex.shaded.ning19.org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /10.66.18.168:9090
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
... 8 more
2017-03-02 21:38:57,206 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:58,206 INFO stram.ResourceRequestHandler
(ResourceRequestHandler.java:getHost(240)) - Strict anti-affinity = [] for
container with operators PTOperator[id=13,name=Retriggered_Recs_Hdfs_Write]
2017-03-02 21:38:58,206 INFO stram.ResourceRequestHandler
(ResourceRequestHandler.java:getHost(272)) - Found host null
2017-03-02 21:38:58,206 INFO stram.StreamingAppMasterService
(StreamingAppMasterService.java:sendContainerAskToRM(1103)) - Asking RM for
containers: [Capability[<memory:10240, vCores:1>]Priority[1508]]
2017-03-02 21:38:58,207 INFO stram.StreamingAppMasterService
(StreamingAppMasterService.java:sendContainerAskToRM(1105)) - Requested
container: Capability[<memory:10240, vCores:1>]Priority[1508] on host:
[null]
2017-03-02 21:38:58,210 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:38:59,211 INFO stram.StreamingAppMasterService
(StreamingAppMasterService.java:execute(864)) - Got new container.,
containerId=container_e3076_1487635160678_27652_01_001539,
containerNode=brdn1164.target.com:45454,
containerNodeURI=brdn1164.target.com:8042, containerResourceMemory10240,
priority1508
2017-03-02 21:38:59,211 INFO stram.StreamingContainerManager
(StreamingContainerManager.java:assignContainer(1256)) - Removing container
agent container_e3076_1487635160678_27652_01_001538
2017-03-02 21:38:59,212 INFO
delegation.AbstractDelegationTokenSecretManager
(AbstractDelegationTokenSecretManager.java:createPassword(385)) - Creating
password for identifier: owner=SVFFLHDS, renewer=, realUser=,
issueDate=1488512339212, maxDate=4611687506939727115, sequenceNumber=1509,
masterKeyId=2, currentKey: 2
2017-03-02 21:38:59,212 INFO stram.LaunchContainerRunnable
(LaunchContainerRunnable.java:run(150)) - Setting up container launch
context for containerid=container_e3076_1487635160678_27652_01_001539
2017-03-02 21:38:59,212 INFO stram.LaunchContainerRunnable
(LaunchContainerRunnable.java:setClasspath(125)) - CLASSPATH:
./*:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:.
2017-03-02 21:38:59,239 INFO util.BasicContainerOptConfigurator
(BasicContainerOptConfigurator.java:getJVMOptions(65)) - property map for
operator {-Xmx=9216m, Generic=null}
2017-03-02 21:38:59,239 INFO stram.LaunchContainerRunnable
(LaunchContainerRunnable.java:getChildVMCommand(243)) - Jvm opts
-Xmx9663676416 for container container_e3076_1487635160678_27652_01_001539
2017-03-02 21:38:59,239 INFO stram.LaunchContainerRunnable
(LaunchContainerRunnable.java:run(191)) - Launching on node:
brdn1164.target.com:45454 command: $JAVA_HOME/bin/java -Xmx9663676416
-Ddt.attr.APPLICATION_PATH=hdfs://littleredns/user/SVFFLHDS/datatorrent/apps/application_1487635160678_27652
-Djava.io.tmpdir=$PWD/tmp
-Ddt.cid=container_e3076_1487635160678_27652_01_001539
-Dhadoop.root.logger=INFO,RFA -Dhadoop.log.dir=<LOG_DIR>
com.datatorrent.stram.engine.StreamingContainer 1><LOG_DIR>/stdout
2><LOG_DIR>/stderr
2017-03-02 21:38:59,239 INFO impl.NMClientAsyncImpl
(NMClientAsyncImpl.java:run(536)) - Processing Event EventType:
START_CONTAINER for Container container_e3076_1487635160678_27652_01_001539
2017-03-02 21:38:59,239 INFO impl.ContainerManagementProtocolProxy
(ContainerManagementProtocolProxy.java:newProxy(260)) - Opening proxy :
brdn1164.target.com:45454
2017-03-02 21:38:59,240 ERROR stram.FSEventRecorder
(FSEventRecorder.java:run(85)) - Caught Exception
java.net.ConnectException: Connection refused: /10.66.18.168:9090
at
org.apache.apex.shaded.ning19.com.ning.http.client.providers.netty.request.NettyConnectListener.onFutureFailure(NettyConnectListener.java:133)
at
org.apache.apex.shaded.ning19.com.ning.http.client.providers.netty.request.NettyConnectListener.operationComplete(NettyConnectListener.java:145)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:409)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:400)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.DefaultChannelFuture.setFailure(DefaultChannelFuture.java:362)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:109)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
at
org.apache.apex.shaded.ning19.org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.apache.apex.shaded.ning19.org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /10.66.18.168:9090
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
... 8 more
2017-03-02 21:38:59,244 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:00,247 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:01,251 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:01,361 INFO ipc.Server (Server.java:saslProcess(1538)) -
Auth successful for SVFFLHDS (auth:TOKEN)
2017-03-02 21:39:01,365 INFO authorize.ServiceAuthorizationManager
(ServiceAuthorizationManager.java:authorize(137)) - Authorization successful
for SVFFLHDS (auth:TOKEN) for protocol=interface
com.datatorrent.stram.api.StreamingContainerUmbilicalProtocol
2017-03-02 21:39:01,535 INFO stram.StreamingContainerParent
(StreamingContainerParent.java:log(166)) - child msg:
[container_e3076_1487635160678_27652_01_001539] Entering heartbeat loop..
context:
PTContainer[id=15(container_e3076_1487635160678_27652_01_001539),state=ALLOCATED,operators=[PTOperator[id=13,name=Retriggered_Recs_Hdfs_Write]]]
2017-03-02 21:39:02,255 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:02,734 INFO stram.StreamingContainerManager
(StreamingContainerManager.java:processHeartbeat(1459)) - Container
container_e3076_1487635160678_27652_01_001539 buffer server:
brdn1164.target.com:60138
2017-03-02 21:39:03,259 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:04,264 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:05,267 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:06,272 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:07,278 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:08,283 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:09,288 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:10,293 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:11,297 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:12,302 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:13,307 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:14,311 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:15,314 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:16,319 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:17,323 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:18,327 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:19,331 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:20,335 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:21,338 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:22,342 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:23,346 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:24,351 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:25,355 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:26,359 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:27,362 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:28,366 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:29,370 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:30,375 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:31,379 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:32,383 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:33,383 INFO stram.StreamingContainerManager
(StreamingContainerManager.java:monitorHeartbeat(789)) - Container
container_e3076_1487635160678_27652_01_001...@brdn1164.target.com:45454
heartbeat timeout (30649 ms).
2017-03-02 21:39:33,386 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:34,386 INFO stram.StreamingAppMasterService
(StreamingAppMasterService.java:sendContainerAskToRM(1119)) - Requested stop
container container_e3076_1487635160678_27652_01_001539
2017-03-02 21:39:34,386 INFO impl.NMClientAsyncImpl
(NMClientAsyncImpl.java:run(536)) - Processing Event EventType:
STOP_CONTAINER for Container container_e3076_1487635160678_27652_01_001539
2017-03-02 21:39:34,388 INFO impl.NMClientImpl
(NMClientImpl.java:stopContainer(242)) - ok, stopContainerInternal..
container_e3076_1487635160678_27652_01_001539
2017-03-02 21:39:34,388 INFO stram.StreamingContainerManager
(StreamingContainerManager.java:monitorHeartbeat(789)) - Container
container_e3076_1487635160678_27652_01_001...@brdn1164.target.com:45454
heartbeat timeout (31654 ms).
2017-03-02 21:39:34,388 INFO impl.ContainerManagementProtocolProxy
(ContainerManagementProtocolProxy.java:newProxy(260)) - Opening proxy :
brdn1164.target.com:45454
2017-03-02 21:39:34,391 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:35,392 INFO stram.StreamingAppMasterService
(StreamingAppMasterService.java:execute(929)) - Completed
containerId=container_e3076_1487635160678_27652_01_001539, state=COMPLETE,
exitStatus=-105, diagnostics=Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
2017-03-02 21:39:35,392 INFO
delegation.AbstractDelegationTokenSecretManager
(AbstractDelegationTokenSecretManager.java:cancelToken(520)) - Token
cancelation requested for identifier: owner=SVFFLHDS, renewer=, realUser=,
issueDate=1488512339212, maxDate=4611687506939727115, sequenceNumber=1509,
masterKeyId=2
2017-03-02 21:39:35,392 INFO stram.StreamingContainerManager
(StreamingContainerManager.java:scheduleContainerRestart(1122)) - Initiating
recovery for
container_e3076_1487635160678_27652_01_001...@brdn1164.target.com:45454
2017-03-02 21:39:35,392 INFO stram.StreamingContainerManager
(StreamingContainerManager.java:scheduleContainerRestart(1138)) - Affected
operators [PTOperator[id=13,name=Retriggered_Recs_Hdfs_Write]]
2017-03-02 21:39:35,395 ERROR stram.FSEventRecorder
(FSEventRecorder.java:run(85)) - Caught Exception
java.io.IOException: Connection is not open
at
com.datatorrent.stram.util.PubSubWebSocketClient.assertUsable(PubSubWebSocketClient.java:264)
at
com.datatorrent.stram.util.PubSubWebSocketClient.publish(PubSubWebSocketClient.java:287)
at
com.datatorrent.stram.util.SharedPubSubWebSocketClient.publish(SharedPubSubWebSocketClient.java:120)
at
com.datatorrent.stram.FSEventRecorder$EventRecorderThread.run(FSEventRecorder.java:79)
2017-03-02 21:39:35,409 ERROR stram.FSEventRecorder
(FSEventRecorder.java:run(85)) - Caught Exception
java.net.ConnectException: Connection refused: /10.66.18.168:9090
at
org.apache.apex.shaded.ning19.com.ning.http.client.providers.netty.request.NettyConnectListener.onFutureFailure(NettyConnectListener.java:133)
at
org.apache.apex.shaded.ning19.com.ning.http.client.providers.netty.request.NettyConnectListener.operationComplete(NettyConnectListener.java:145)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:409)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:400)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.DefaultChannelFuture.setFailure(DefaultChannelFuture.java:362)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:109)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
at
org.apache.apex.shaded.ning19.org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.apache.apex.shaded.ning19.org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /10.66.18.168:9090
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
at
org.apache.apex.shaded.ning19.org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
... 8 more
2017-03-02 21:39:35,412 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:36,412 INFO stram.ResourceRequestHandler
(ResourceRequestHandler.java:getHost(240)) - Strict anti-affinity = [] for
container with operators PTOperator[id=13,name=Retriggered_Recs_Hdfs_Write]
2017-03-02 21:39:36,413 INFO stram.ResourceRequestHandler
(ResourceRequestHandler.java:getHost(272)) - Found host null
2017-03-02 21:39:36,413 INFO stram.StreamingAppMasterService
(StreamingAppMasterService.java:sendContainerAskToRM(1103)) - Asking RM for
containers: [Capability[<memory:10240, vCores:1>]Priority[1509]]
2017-03-02 21:39:36,413 INFO stram.StreamingAppMasterService
(StreamingAppMasterService.java:sendContainerAskToRM(1105)) - Requested
container: Capability[<memory:10240, vCores:1>]Priority[1509] on host:
[null]
2017-03-02 21:39:36,417 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:37,418 INFO stram.StreamingAppMasterService
(StreamingAppMasterService.java:execute(864)) - Got new container.,
containerId=container_e3076_1487635160678_27652_01_001540,
containerNode=brdn1164.target.com:45454,
containerNodeURI=brdn1164.target.com:8042, containerResourceMemory10240,
priority1509
2017-03-02 21:39:37,418 INFO stram.StreamingContainerManager
(StreamingContainerManager.java:assignContainer(1256)) - Removing container
agent container_e3076_1487635160678_27652_01_001539
2017-03-02 21:39:37,419 INFO
delegation.AbstractDelegationTokenSecretManager
(AbstractDelegationTokenSecretManager.java:createPassword(385)) - Creating
password for identifier: owner=SVFFLHDS, renewer=, realUser=,
issueDate=1488512377419, maxDate=4611687506939765322, sequenceNumber=1510,
masterKeyId=2, currentKey: 2
2017-03-02 21:39:37,419 INFO stram.LaunchContainerRunnable
(LaunchContainerRunnable.java:run(150)) - Setting up container launch
context for containerid=container_e3076_1487635160678_27652_01_001540
2017-03-02 21:39:37,419 INFO stram.LaunchContainerRunnable
(LaunchContainerRunnable.java:setClasspath(125)) - CLASSPATH:
./*:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:.
2017-03-02 21:39:42,215 INFO util.BasicContainerOptConfigurator
(BasicContainerOptConfigurator.java:getJVMOptions(65)) - property map for
operator {-Xmx=9216m, Generic=null}
2017-03-02 21:39:42,229 INFO stram.LaunchContainerRunnable
(LaunchContainerRunnable.java:getChildVMCommand(243)) - Jvm opts
-Xmx9663676416 for container container_e3076_1487635160678_27652_01_001540
2017-03-02 21:39:42,230 INFO stram.LaunchContainerRunnable
(LaunchContainerRunnable.java:run(191)) - Launching on node:
brdn1164.target.com:45454 command: $JAVA_HOME/bin/java -Xmx9663676416
-Ddt.attr.APPLICATION_PATH=hdfs://littleredns/user/SVFFLHDS/datatorrent/apps/application_1487635160678_27652
-Djava.io.tmpdir=$PWD/tmp
-Ddt.cid=container_e3076_1487635160678_27652_01_001540
-Dhadoop.root.logger=INFO,RFA -Dhadoop.log.dir=<LOG_DIR>
com.datatorrent.stram.engine.StreamingContainer 1><LOG_DIR>/stdout
2><LOG_DIR>/stderr
2017-03-02 21:39:42,230 INFO impl.NMClientAsyncImpl
(NMClientAsyncImpl.java:run(536)) - Processing Event EventType:
START_CONTAINER for Container container_e3076_1487635160678_27652_01_001540
2017-03-02 21:39:42,230 INFO impl.ContainerManagementProtocolProxy
(ContainerManagementProtocolProxy.java:newProxy(260)) - Opening proxy :
brdn1164.target.com:45454
2017-03-02 21:39:42,231 ERROR stram.FSEventRecorder
(FSEventRecorder.java:run(85)) - Caught Exception
java.io.IOException: Connection is not open
at
com.datatorrent.stram.util.PubSubWebSocketClient.assertUsable(PubSubWebSocketClient.java:264)
at
com.datatorrent.stram.util.PubSubWebSocketClient.publish(PubSubWebSocketClient.java:287)
at
com.datatorrent.stram.util.SharedPubSubWebSocketClient.publish(SharedPubSubWebSocketClient.java:120)
at
com.datatorrent.stram.FSEventRecorder$EventRecorderThread.run(FSEventRecorder.java:79)
2017-03-02 21:39:42,234 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:43,238 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:44,122 INFO ipc.Server (Server.java:saslProcess(1538)) -
Auth successful for SVFFLHDS (auth:TOKEN)
2017-03-02 21:39:44,127 INFO authorize.ServiceAuthorizationManager
(ServiceAuthorizationManager.java:authorize(137)) - Authorization successful
for SVFFLHDS (auth:TOKEN) for protocol=interface
com.datatorrent.stram.api.StreamingContainerUmbilicalProtocol
2017-03-02 21:39:44,242 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:44,267 INFO stram.StreamingContainerParent
(StreamingContainerParent.java:log(166)) - child msg:
[container_e3076_1487635160678_27652_01_001540] Entering heartbeat loop..
context:
PTContainer[id=15(container_e3076_1487635160678_27652_01_001540),state=ALLOCATED,operators=[PTOperator[id=13,name=Retriggered_Recs_Hdfs_Write]]]
2017-03-02 21:39:45,246 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:45,440 INFO stram.StreamingContainerManager
(StreamingContainerManager.java:processHeartbeat(1459)) - Container
container_e3076_1487635160678_27652_01_001540 buffer server:
brdn1164.target.com:50773
2017-03-02 21:39:46,251 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:47,255 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:48,259 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:49,262 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:50,266 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:51,270 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:52,280 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:53,285 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:54,288 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:55,292 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:56,295 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:57,299 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:58,311 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
2017-03-02 21:39:59,316 WARN stram.StreamingContainerManager
(StreamingContainerManager.java:calculateEndWindowStats(823)) - Some
operators are behind for more than 1000 windows! Trimming the end window
stats map
Thanks again for the reply.

Thanks,
Rishi Mishra