Hi Chuanlei,

Could you share your Storm version? Is the problem reproducible? I'd like to see a thread dump taken while the supervisor and workers are frozen. I can't dig into the issue with only the symptoms.
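For a supervisor or worker that is already running, the JDK's `jstack <pid>` tool is the usual way to capture a thread dump. As a rough illustration of what such a dump contains, here is a minimal sketch that prints the threads of the JVM it runs in, using the standard `java.lang.management` API. The class name `ThreadDumpSketch` is hypothetical and this code is not part of Storm.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Storm: dumps the threads of the current JVM only.
public class ThreadDumpSketch {

    /** Returns a text dump (name, state, stack, blocking lock) of every live thread. */
    public static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // Request lock details only when this JVM supports them.
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append('"')
               .append(" state=").append(info.getThreadState()).append('\n');
            if (info.getLockName() != null) {
                // The lock this thread is blocked or waiting on, if any.
                out.append("    waiting on ").append(info.getLockName()).append('\n');
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump());
    }
}
```

In a dump of the frozen processes, the thread states and the locks they wait on are the first things worth checking.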
Thanks,
Jungtaek Lim (HeartSaVioR)

2015-07-10 13:36 GMT+09:00 Chuanlei Ni <[email protected]>:

> Maybe the supervisor can launch the workers, but after a few seconds the
> workers die.
>
> 2015-07-10 12:16 GMT+08:00 Chuanlei Ni <[email protected]>:
>
>> Hi,
>>
>> I got an error while using Storm.
>> I turned the time of my machine back an hour, and the supervisor and
>> workers were all suspended (they no longer print logs, but the processes
>> are still alive). Maybe it is because the ZK session expired. When I restart
>> the supervisor, it kills the suspended workers, but the supervisor
>> cannot launch the workers. It is weird.
>>
>> The log is below:
>> 2015-07-10 12:06:09,856 util=[INFO] Touching file at
>> /home/storm/storm/workers/165e1c3c-0ce4-42f4-8e64-4f949c413fb5/cglimitpids/8862
>> 2015-07-10 12:06:09,856 Utils=[ERROR] [ChildProcess stderr reader:
>> 165e1c3c-0ce4-42f4-8e64-4f949c413fb5 - 8862] starts
>> 2015-07-10 12:06:09,856 Utils=[ERROR] [ChildProcess monitor:
>> 165e1c3c-0ce4-42f4-8e64-4f949c413fb5 - 8862] starts
>> 2015-07-10 12:06:09,856 supervisor=[INFO] the worker id
>> 165e1c3c-0ce4-42f4-8e64-4f949c413fb5 pid 8862
>> 2015-07-10 12:06:09,857 supervisor=[INFO]
>> 81122780-afa1-4465-8d8c-65084dae5be7 still hasn't started
>> 2015-07-10 12:06:10,357 supervisor=[INFO]
>> 81122780-afa1-4465-8d8c-65084dae5be7 still hasn't started
>> 2015-07-10 12:06:10,857 supervisor=[INFO]
>> 81122780-afa1-4465-8d8c-65084dae5be7 still hasn't started
>> 2015-07-10 12:06:11,358 supervisor=[INFO]
>> 81122780-afa1-4465-8d8c-65084dae5be7 still hasn't started
>> 2015-07-10 12:06:11,858 supervisor=[INFO]
>> 81122780-afa1-4465-8d8c-65084dae5be7 still hasn't started
>> 2015-07-10 12:06:12,358 supervisor=[INFO]
>> 81122780-afa1-4465-8d8c-65084dae5be7 still hasn't started
>> 2015-07-10 12:06:12,859 supervisor=[INFO]
>> 81122780-afa1-4465-8d8c-65084dae5be7 still hasn't started
>> 2015-07-10 12:06:13,359 supervisor=[INFO]
>> 81122780-afa1-4465-8d8c-65084dae5be7 still hasn't started
>> 2015-07-10 12:06:13,859 supervisor=[INFO]
>> 81122780-afa1-4465-8d8c-65084dae5be7 still hasn't started
>> 2015-07-10 12:06:14,360 supervisor=[INFO]
>> 81122780-afa1-4465-8d8c-65084dae5be7 still hasn't started
>> 2015-07-10 12:06:14,860 supervisor=[INFO]
>> 81122780-afa1-4465-8d8c-65084dae5be7 still hasn't started
>> 2015-07-10 12:06:15,360 supervisor=[INFO]
>> 81122780-afa1-4465-8d8c-65084dae5be7 still hasn't started
>> 2015-07-10 12:06:15,861 supervisor=[INFO]
>> 81122780-afa1-4465-8d8c-65084dae5be7 still hasn't started
>>
>> My questions are:
>> 1. Why are the supervisor and the workers suspended?
>> 2. When I restart the supervisor, why can it not launch the workers?
>>
>> Thanks in advance.
>>
>
>

-- 
Name : 임 정택
Blog : http://www.heartsavior.net / http://dev.heartsavior.net
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior
