Hi,

After I killed the app, the containers did not hang; all the processes were gone, so I can't take a thread dump.
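When the processes exit before an external tool such as jstack can attach, one option is to have the JVM log its own thread states on the way out. The following is a minimal sketch using only the JDK; the class name `ThreadDumpSketch` and the hook name are hypothetical and not part of Samza, and it assumes the container's stderr is captured somewhere readable.

```java
import java.util.Map;

// Hypothetical helper, not part of Samza: formats the state of every live thread.
final class ThreadDumpSketch {

    /** Returns one block per live thread: name, state, then its stack frames. */
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"')
               .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Registered early in the process; runs when the JVM starts to exit on a
        // normal kill (SIGTERM). It does not run on kill -9.
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> System.err.print(dumpAllThreads()), "thread-dump-hook"));
    }
}
```

JVM shutdown hooks run concurrently and in no guaranteed order, so the dump is only a snapshot of what the other threads were doing at that moment. In a job, the hook could be registered from any code that runs at startup, for example the producer's start() method (an assumption, not something the thread confirms).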

Here I only need to write data to HDFS, so my implementation of SystemFactory returns null from both getConsumer and getAdmin and only provides a valid SystemProducer. Is this a problem?

Thanks.

————————
舒琦
Address: 6F, Unit 1, Building A4, Lugu Enterprise Plaza, 27 Wenxuan Road, Yuelu District, Changsha
Website: http://www.eefung.com
Weibo: http://weibo.com/eefung
Postal code: 410013
Phone: 400-677-0986
Fax: 0731-88519609

> On January 18, 2017, at 01:59, Yi Pan <nickpa...@gmail.com> wrote:
>
> You probably should return a valid SystemAdmin object, but returning null
> for SystemConsumer should be OK. Again, two questions:
> 1) Did the container hang during the shutdown, or did it just crash with an
> exception? Since stderr does not show anything, I was assuming that the
> container hangs.
> 2) If the container hangs, could you take a thread dump?
>
> Thanks!
>
> -Yi
>
> On Tue, Jan 17, 2017 at 1:50 AM, 舒琦 <sh...@eefung.com> wrote:
>
>> Hi,
>>
>> My SystemFactory implementation returns null for both getConsumer and
>> getAdmin. Is this the cause of the problem?
>>
>> Thanks.
>>
>> ————————
>> 舒琦
>>
>>> On January 17, 2017, at 17:18, Yi Pan <nickpa...@gmail.com> wrote:
>>>
>>> Hi, Qi,
>>>
>>> In your log, the output stops at "closing simple consumer...", which is
>>> part of the shutdownConsumers() method in the shutdown sequence. Are you
>>> sure that the container process actually proceeds further in the shutdown
>>> sequence? If it does not (i.e. it is somehow stuck at a step before
>>> shutdownProducers()), your producer's stop() method will not be executed.
>>> I noticed that your log file does not even contain the line
>>> "Shutting down task instance stream tasks.", which means your program did
>>> not even execute shutdownTask() in the shutdown sequence (right after
>>> shutdownConsumers()).
>>> Since your stderr does not report an exception either, can you check your
>>> implementation of HStoreSystemConsumer to see whether the consumer hangs
>>> on shutdown? A thread dump would be super helpful here.
>>>
>>> On Sun, Jan 15, 2017 at 11:30 PM, 舒琦 <sh...@eefung.com> wrote:
>>> Hi,
>>>
>>> Thanks for your help.
>>>
>>> Here are 2 questions:
>>>
>>> 1. I have defined my own HDFS producer, which implements SystemProducer
>>> and overrides the stop method (I log something in the first line of the
>>> stop method), but when I kill the app, the log line is not printed. The
>>> tricky thing is that the logic defined in the stop method is sometimes
>>> executed and sometimes not.
>>>
>>> Below is the stop method:
>>>
>>>     @Override
>>>     public void stop() {
>>>         try {
>>>             LOGGER.info("Begin to close files");
>>>             closeFiles();
>>>         } catch (IOException e) {
>>>             LOGGER.error("Error when close Files", e);
>>>         }
>>>
>>>         if (fs != null) {
>>>             try {
>>>                 fs.close();
>>>             } catch (IOException e) {
>>>                 //do nothing
>>>             }
>>>         }
>>>     }
>>>
>>> Below is the log:
>>>
>>> 2017-01-16 15:13:35.273 [Thread-9] SamzaContainer [INFO] Shutting down, will wait up to 5000 ms
>>> 2017-01-16 15:13:35.284 [main] SamzaContainer [INFO] Shutting down.
>>> 2017-01-16 15:13:35.285 [main] SamzaContainer [INFO] Shutting down consumer multiplexer.
>>> 2017-01-16 15:13:35.287 [main] BrokerProxy [INFO] Shutting down BrokerProxy for 172.19.105.22:9096
>>> 2017-01-16 15:13:35.288 [main] BrokerProxy [INFO] closing simple consumer...
>>> 2017-01-16 15:13:35.340 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at 172.19.105.22:9096 for client samza_consumer-canal_status_persistent_hstore-1] BrokerProxy [INFO] Got interrupt exception in broker proxy thread.
>>> 2017-01-16 15:13:35.340 [main] BrokerProxy [INFO] Shutting down BrokerProxy for 172.19.105.21:9096
>>> 2017-01-16 15:13:35.341 [main] BrokerProxy [INFO] closing simple consumer...
>>>
>>> You can see that the log line "Begin to close files" is not printed, so
>>> of course the logic is not executed.
>>>
>>> 2. The Hadoop cluster I use is HDP-2.5.0 and log aggregation is enabled,
>>> but the container logs cannot be collected; only the AM log can be seen.
>>>
>>> ————————
>>> ShuQi
>>>
>>>> On January 16, 2017, at 10:39, Liu Bo <diabl...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> *container log will be removed automatically,*
>>>>
>>>> you can turn on YARN log aggregation, so that the logs of terminated
>>>> YARN jobs will be dumped to HDFS
>>>>
>>>> On 14 January 2017 at 07:44, Yi Pan <nickpa...@gmail.com> wrote:
>>>>
>>>>> Hi, Qi,
>>>>>
>>>>> Sorry for the late reply. I am curious about your comment that the
>>>>> close and stop methods are not called. When a user initiates a kill
>>>>> request, the graceful shutdown sequence is triggered by the shutdown
>>>>> hook added to SamzaContainer. The shutdown sequence in the code is the
>>>>> following:
>>>>> {code}
>>>>> info("Shutting down.")
>>>>>
>>>>> shutdownConsumers
>>>>> shutdownTask
>>>>> shutdownStores
>>>>> shutdownDiskSpaceMonitor
>>>>> shutdownHostStatisticsMonitor
>>>>> shutdownProducers
>>>>> shutdownLocalityManager
>>>>> shutdownOffsetManager
>>>>> shutdownMetrics
>>>>> shutdownSecurityManger
>>>>>
>>>>> info("Shutdown complete.")
>>>>> {code}
>>>>>
>>>>> Here MessageChooser.stop() is invoked in shutdownConsumers, and
>>>>> SystemProducer.stop() is invoked in shutdownProducers.
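
Because the steps in the sequence above run strictly in order, a step that blocks keeps every later step from running. The following is a minimal sketch of that effect using only the JDK. It is a simplified model, not Samza's implementation: the class `ShutdownOrderSketch` and the helpers `runShutdown` and `blockForever` are hypothetical, and the timeout stands in for a shutdown time budget such as the "will wait up to 5000 ms" message in the log.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical model of an ordered shutdown; this is not Samza's implementation.
final class ShutdownOrderSketch {

    /**
     * Runs the steps in order on a worker thread, waits at most timeoutMs for
     * them, and returns the names of the steps that finished in that time.
     */
    static List<String> runShutdown(String[] names, Runnable[] steps, long timeoutMs) {
        List<String> completed = Collections.synchronizedList(new ArrayList<>());
        Thread worker = new Thread(() -> {
            for (int i = 0; i < steps.length; i++) {
                steps[i].run();          // a step that blocks here stalls every later step
                completed.add(names[i]);
            }
        }, "shutdown-sequence");
        worker.setDaemon(true);          // lets the JVM exit even if a step never returns
        worker.start();
        try {
            worker.join(timeoutMs);      // bounded wait, like a shutdown time budget
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(completed);
    }

    /** A step that never returns, standing in for a component stuck on shutdown. */
    static void blockForever() {
        try {
            new CountDownLatch(1).await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        String[] names = {"shutdownConsumers", "shutdownTask", "shutdownProducers"};
        Runnable[] steps = {
            () -> { },                            // completes normally
            ShutdownOrderSketch::blockForever,    // simulated hang
            () -> System.out.println("producer stop() called")
        };
        System.out.println(runShutdown(names, steps, 200));
    }
}
```

In this model the producer step is never reached once the second step hangs, which matches the symptom of a stop() method whose first log line does not appear. Whether that is what happens in the reported job can only be confirmed from the container's logs or a thread dump.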
>>>>>
>>>>> Could you explain why you are not able to shut down a Samza job
>>>>> gracefully?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> -Yi
>>>>>
>>>>> On Mon, Dec 12, 2016 at 6:33 PM, 舒琦 <sh...@eefung.com> wrote:
>>>>>
>>>>>> Hi Guys,
>>>>>>
>>>>>> How can I stop a running Samza job gracefully, other than by killing
>>>>>> it?
>>>>>>
>>>>>> When a Samza job is killed, the close and stop methods in
>>>>>> BaseMessageChooser and SystemProducer are not called and the container
>>>>>> log is removed automatically. How can I resolve this?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> ————————
>>>>>> ShuQi
>>>>>
>>>>
>>>> --
>>>> All the best
>>>>
>>>> Liu Bo