Hi all,
From the documentation, I understood that Slider will try to recover from a
container failure, but in my test application it doesn't.
I'm using the 0.40 release, built from source.
Here is what I see.
*If I kill the child process*
The agent's check_process_status method raises ComponentIsNotRunning, and the
component never comes back:
2014-11-14 16:18:40,274 - Error while executing command 'status':
Traceback (most recent call last):
  File "/yarn/nm/usercache/vagrant/appcache/application_1415305968048_0008/container_1415305968048_0008_01_000002/infra/agent/slider-agent/resource_management/libraries/script/script.py", line 114, in execute
    method(env)
  File "/yarn/nm/usercache/vagrant/appcache/application_1415305968048_0008/container_1415305968048_0008_01_000002/app/definition/package/scripts/kafka.py", line 60, in status
    check_process_status(status_params.pid_file)
  File "/yarn/nm/usercache/vagrant/appcache/application_1415305968048_0008/container_1415305968048_0008_01_000002/infra/agent/slider-agent/resource_management/libraries/functions/check_process_status.py", line 45, in check_process_status
    raise ComponentIsNotRunning()
ComponentIsNotRunning
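
For context, my understanding is that a pid-file status check like the one in the traceback works roughly as sketched below. This is only an illustrative Python 3 sketch, not the actual slider-agent source: the names check_process_status and ComponentIsNotRunning come from the traceback, but the function body and the exception class defined here are my assumptions.

```python
import os


class ComponentIsNotRunning(Exception):
    """Hypothetical stand-in for the agent's exception of the same name."""


def check_process_status(pid_file):
    """Raise ComponentIsNotRunning unless pid_file names a live process."""
    # A missing pid file means the component is not (or no longer) running.
    if not pid_file or not os.path.isfile(pid_file):
        raise ComponentIsNotRunning()
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        # Unreadable or malformed pid file: treat the component as down.
        raise ComponentIsNotRunning()
    try:
        # Signal 0 only checks that the process exists; nothing is delivered.
        os.kill(pid, 0)
    except ProcessLookupError:
        raise ComponentIsNotRunning()
    except PermissionError:
        # The process exists but belongs to another user: still running.
        pass
```

If the real check behaves like this, then once the child process is killed every later 'status' command would fail in the same way, which would match the error above.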
*If I kill the agent process*, it is just silently gone.
Best,
Siyuan