I think this is a bug that was fixed in the latest version of Slider, 0.92.
I haven't figured out why it only happens sometimes, but the following
change seems to resolve it:

diff --git a/slider-agent/src/main/python/agent/ActionQueue.py b/slider-agent/src/main/python/agent/ActionQueue.py
index 7514337..e973337 100644
--- a/slider-agent/src/main/python/agent/ActionQueue.py
+++ b/slider-agent/src/main/python/agent/ActionQueue.py
@@ -161,7 +161,7 @@ class ActionQueue(threading.Thread):
     self.commandStatuses.put_command_status(command, in_progress_status,
                                             reportResult)
     store_config = False
-    if ActionQueue.STORE_APPLIED_CONFIG in command['commandParams']:
+    if 'commandParams' in command and ActionQueue.STORE_APPLIED_CONFIG in command['commandParams']:
       store_config = 'true' == command['commandParams'][ActionQueue.STORE_APPLIED_CONFIG]
     store_command = False
     if 'roleParams' in command and command['roleParams'] is not None and ActionQueue.AUTO_RESTART in command['roleParams']:
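
For what it's worth, the traceback quoted below shows the KeyError being
raised while signal_handler runs controller.stopCommand, so the stop
command is presumably a dict without a 'commandParams' key. Here is a
minimal standalone sketch of the same guard, not the actual Slider code:
should_store_config, the 'roleCommand' key and the value given to
STORE_APPLIED_CONFIG are made up for illustration, and it uses dict.get
rather than the "in" test from the patch, which also tolerates a
'commandParams' value of None.

```python
# Hypothetical stand-in for ActionQueue.STORE_APPLIED_CONFIG; the real
# constant's value does not appear in this thread.
STORE_APPLIED_CONFIG = 'store_applied_config'


def should_store_config(command):
    """Return True only if the command explicitly asks to store its config."""
    # Guard the lookup: some commands (apparently the stop command) have
    # no 'commandParams' key, and the value could also be None.
    params = command.get('commandParams') or {}
    return params.get(STORE_APPLIED_CONFIG) == 'true'


# A stop-like command without 'commandParams' no longer raises KeyError.
stop_command = {'roleCommand': 'STOP'}  # assumed shape, for illustration
start_command = {'commandParams': {STORE_APPLIED_CONFIG: 'true'}}

print(should_store_config(stop_command))
print(should_store_config(start_command))
```

The unguarded form, command['commandParams'], is what fails at line 164
of ActionQueue.py in the traceback.
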
On Thu, Mar 23, 2017 at 6:25 PM, David.Serafini wrote:
> Can anyone tell me what this error means and whether it is significant?
> I have a slider job that seems to randomly fail, and I don't see anything
> interesting in the AppMaster logs except this. (That doesn't mean there
> isn't an error elsewhere: YARN is wiping out the job directories as soon as
> the container terminates, and I haven't figured out how to fix that).
>
> In case it matters, my job is a shell script specified in metainfo.json
> in application.components.commands.exec. The script does some setup
> and then runs tomcat.
>
> thanks in advance,
> david
>
>
> Connecting to the server at https://brdn1088.target.com:42721/ws/v1/slider/agents/...
> Registered with the server
> Traceback (most recent call last):
>   File "./infra/agent/slider-agent/agent/main.py", line 318, in <module>
>     main()
>   File "./infra/agent/slider-agent/agent/main.py", line 311, in main
>     controller.join(timeout=1.0)
>   File "/usr/lib64/python2.6/threading.py", line 655, in join
>     self.__block.wait(delay)
>   File "/usr/lib64/python2.6/threading.py", line 258, in wait
>     _sleep(delay)
>   File "./infra/agent/slider-agent/agent/main.py", line 66, in signal_handler
>     controller.actionQueue.execute_command(controller.stopCommand)
>   File "/grid/4/hadoop/yarn/local/usercache/Z002JSF/appcache/application_1490038663882_9176/filecache/11/slider-agent.tar.gz/slider-agent/agent/ActionQueue.py", line 164, in execute_command
>     if ActionQueue.STORE_APPLIED_CONFIG in command['commandParams']:
> KeyError: 'commandParams'
>
>
>