Hi,

I am running an Apex application that reads input from Kafka, parses it in
downstream operators, and finally writes to HDFS. I do not know exactly why,
but it sometimes gets killed or fails after a few hours or a few days.
Below is the error I see in the dt.log file:

2017-03-02 21:39:42,230 INFO  impl.ContainerManagementProtocolProxy (ContainerManagementProtocolProxy.java:newProxy(260)) - Opening proxy : brdn1164.target.com:45454
2017-03-02 21:39:42,231 ERROR stram.FSEventRecorder (FSEventRecorder.java:run(85)) - Caught Exception
java.io.IOException: Connection is not open
        at com.datatorrent.stram.util.PubSubWebSocketClient.assertUsable(PubSubWebSocketClient.java:264)
        at com.datatorrent.stram.util.PubSubWebSocketClient.publish(PubSubWebSocketClient.java:287)
        at com.datatorrent.stram.util.SharedPubSubWebSocketClient.publish(SharedPubSubWebSocketClient.java:120)
        at com.datatorrent.stram.FSEventRecorder$EventRecorderThread.run(FSEventRecorder.java:79)
2017-03-02 21:39:42,234 WARN  stram.StreamingContainerManager (StreamingContainerManager.java:calculateEndWindowStats(823)) - Some operators are behind for more than 1000 windows! Trimming the end window stats map
2017-03-02 21:39:43,238 WARN  stram.StreamingContainerManager (StreamingContainerManager.java:calculateEndWindowStats(823)) - Some operators are behind for more than 1000 windows! Trimming the end window stats map
2017-03-02 21:39:44,122 INFO  ipc.Server (Server.java:saslProcess(1538)) - Auth successful for SVFFLHDS (auth:TOKEN)
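
To see when the warnings and errors start relative to the failure, the
entries in dt.log could be tallied by level and logger. A minimal Python
sketch, assuming every entry begins with a timestamp, a level, and a logger
name as in the excerpt above; the function name summarize and the regular
expression are illustrative and not part of Apex:

```python
import re
from collections import Counter

# Matches the start of a log entry such as:
#   2017-03-02 21:39:42,231 ERROR stram.FSEventRecorder
# (format assumed from the dt.log excerpt; adjust if your layout differs)
ENTRY = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s+"
    r"(INFO|WARN|ERROR|FATAL)\s+(\S+)"
)


def summarize(lines, levels=("WARN", "ERROR", "FATAL")):
    """Count entries per (level, logger) and note when each first appeared."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = ENTRY.match(line)
        if match is None:
            # Stack-trace lines and wrapped continuations are skipped.
            continue
        timestamp, level, logger = match.groups()
        if level in levels:
            key = (level, logger)
            counts[key] += 1
            first_seen.setdefault(key, timestamp)
    return counts, first_seen
```

For example, counts, first_seen = summarize(open("dt.log", errors="replace"))
gives the number of WARN/ERROR entries per logger and the first timestamp
at which each one was seen.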


It looks like the container is not sending heartbeats to the App Master, but
that is just a guess. If anybody has faced this error, please share your
insights.

Thanks
Rishi Mishra



--
View this message in context: 
http://apache-apex-users-list.78494.x6.nabble.com/Apex-application-getting-killed-at-regular-interval-tp1421.html
Sent from the Apache Apex Users list mailing list archive at Nabble.com.
