Hi Rishi,

The FSEventRecorder exception is ignored by the platform; it should not kill the application.

2017-03-02 21:39:43,238 WARN stram.StreamingContainerManager (StreamingContainerManager.java:calculateEndWindowStats(823)) - Some operators are behind for more than 1000 windows! Trimming the end window stats map

This warning shows that either you have slow operators in your DAG, or some downstream operator is failing continuously. Continuous operator failures can shut down the application. Do you see a "Shutdown after reaching failure threshold for" message in the log?

- Tushar.

On Fri, Mar 10, 2017 at 12:18 PM, rishimishra <[email protected]> wrote:

> hi ,
>
> I am running one of the Apex Application , which takes input from kafka and
> other operator parse the input and finally write in HDFS . Do not know
> exactly why it is getting killed/failed sometimes after few hrs or after
> few days . Below is the error , which I see in dt.log file:-
>
> 2017-03-02 21:39:42,230 INFO impl.ContainerManagementProtocolProxy (ContainerManagementProtocolProxy.java:newProxy(260)) - Opening proxy : brdn1164.target.com:45454
> 2017-03-02 21:39:42,231 ERROR stram.FSEventRecorder (FSEventRecorder.java:run(85)) - Caught Exception
> java.io.IOException: Connection is not open
>   at com.datatorrent.stram.util.PubSubWebSocketClient.assertUsable(PubSubWebSocketClient.java:264)
>   at com.datatorrent.stram.util.PubSubWebSocketClient.publish(PubSubWebSocketClient.java:287)
>   at com.datatorrent.stram.util.SharedPubSubWebSocketClient.publish(SharedPubSubWebSocketClient.java:120)
>   at com.datatorrent.stram.FSEventRecorder$EventRecorderThread.run(FSEventRecorder.java:79)
> 2017-03-02 21:39:42,234 WARN stram.StreamingContainerManager (StreamingContainerManager.java:calculateEndWindowStats(823)) - Some operators are behind for more than 1000 windows! Trimming the end window stats map
> 2017-03-02 21:39:43,238 WARN stram.StreamingContainerManager (StreamingContainerManager.java:calculateEndWindowStats(823)) - Some operators are behind for more than 1000 windows! Trimming the end window stats map
> 2017-03-02 21:39:44,122 INFO ipc.Server (Server.java:saslProcess(1538)) - Auth successful for SVFFLHDS (auth:TOKEN)
>
> Looks like container not sending the heartbeat to App Master , just a
> guess.
> If some body faced this error please share your insights.
>
> Thanks
> Rishi Mishra
>
> --
> View this message in context: http://apache-apex-users-list.78494.x6.nabble.com/Apex-application-getting-killed-at-regular-interval-tp1421.html
> Sent from the Apache Apex Users list mailing list archive at Nabble.com.
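
To check dt.log for the messages discussed in this thread, something like the following might help. It is only a rough sketch, not part of Apex: the function names `scan_log` and `scan_file` and the pattern labels are made up for illustration, and the search strings are copied from the log excerpts above, so they may need adjusting for other Apex versions.

```python
# Substrings copied from the log excerpts in this thread.
# The dictionary keys are illustrative labels, not Apex terminology.
PATTERNS = {
    "failure_threshold": "Shutdown after reaching failure threshold for",
    "operators_behind": "Some operators are behind for more than",
    "event_recorder_error": "ERROR stram.FSEventRecorder",
}


def scan_log(lines):
    """Count how many lines contain each of the known substrings."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, needle in PATTERNS.items():
            if needle in line:
                counts[name] += 1
    return counts


def scan_file(path):
    """Scan a log file line by line; undecodable bytes are replaced."""
    with open(path, errors="replace") as handle:
        return scan_log(handle)
```

Calling `scan_file("dt.log")` (the path is an assumption; use wherever your application master log lives) returns a count per pattern. A non-zero `failure_threshold` count would suggest the application was shut down because of repeated operator failures, while only `operators_behind` warnings would point more toward slow operators.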
