Hi Steve,

  This is what I see in the AM's log after the STOP command is issued. Even
though the log indicates that the STOP command SUCCEEDED, the stop
function in my Python script is not getting executed. Does the exception at
the end of this log indicate a problem?

2015-03-14 07:24:01,202 [IPC Server handler 2 on 39387] INFO
appmaster.SliderAppMaster - SliderAppMasterApi.stopCluster: stop
command issued:  exit code = 0, SUCCEEDED: stop command issued;
2015-03-14 07:24:02,202 [AmExecutor-006] INFO
appmaster.SliderAppMaster - SliderAppMasterApi.stopCluster: stop
command issued
2015-03-14 07:24:02,202 [main] INFO  appmaster.SliderAppMaster -
Triggering shutdown of the AM: stop command issued:  exit code = 0,
SUCCEEDED: stop command issued;
2015-03-14 07:24:02,202 [main] INFO  appmaster.SliderAppMaster -
Process has exited with exit code 0 mapped to 0 -ignoring
2015-03-14 07:24:02,202 [main] INFO  workflow.WorkflowCompositeService
- Child service completed Service RoleLaunchService in state
RoleLaunchService: STOPPED
2015-03-14 07:24:02,202 [main] INFO  state.AppState - Releasing 2 containers
2015-03-14 07:24:02,203 [main] INFO  state.AppState - Releasing
container. Log:
http://bdvs1395.svl.ibm.com:19888/jobhistory/logs/bdvs1395.svl.ibm.com:45454/container_1425452295813_0123_01_000002/ctx/bigsql
2015-03-14 07:24:02,203 [main] INFO  state.AppState - Releasing
container. Log:
http://bdvs1395.svl.ibm.com:19888/jobhistory/logs/bdvs1396.svl.ibm.com:45454/container_1425452295813_0123_01_000003/ctx/bigsql
2015-03-14 07:24:02,204 [main] INFO  appmaster.SliderAppMaster -
Application completed. Signalling finish to RM
2015-03-14 07:24:02,204 [main] INFO  appmaster.SliderAppMaster -
Unregistering AM status=SUCCEEDED message=stop command issued
2015-03-14 07:24:02,209 [main] INFO  impl.AMRMClientImpl - Waiting for
application to be successfully unregistered.
2015-03-14 07:24:02,310 [main] INFO  appmaster.SliderAppMaster -
Exiting AM; final exit code = 0
2015-03-14 07:24:02,312 [main] INFO  util.ExitUtil - Exiting with status 0
2015-03-14 07:24:02,326 [Shutdown] INFO  mortbay.log - Shutdown hook executing
2015-03-14 07:24:02,343 [Shutdown] INFO  mortbay.log - Stopped
[email protected]:45840
2015-03-14 07:24:02,354 [Thread-1] INFO  mortbay.log - Stopped
[email protected]:0
2015-03-14 07:24:02,355 [Shutdown] INFO  mortbay.log - Stopped
[email protected]:48056
2015-03-14 07:24:02,358 [Shutdown] INFO  mortbay.log - Shutdown hook complete
2015-03-14 07:24:02,364 [Thread-1] INFO  ipc.Server - Stopping server on 39387
2015-03-14 07:24:02,365 [IPC Server listener on 39387] INFO
ipc.Server - Stopping IPC Server listener on 39387
2015-03-14 07:24:02,366 [IPC Server Responder] INFO  ipc.Server -
Stopping IPC Server Responder
2015-03-14 07:24:02,367 [Thread-1] INFO
impl.ContainerManagementProtocolProxy - Opening proxy :
bdvs1395.svl.ibm.com:45454
2015-03-14 07:24:02,383 [Thread-1] INFO
impl.ContainerManagementProtocolProxy - Opening proxy :
bdvs1396.svl.ibm.com:45454
2015-03-14 07:24:02,429 [AMRM Callback Handler Thread] INFO
impl.AMRMClientAsyncImpl - Interrupted while waiting for queue
java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:274)
2015-03-14 07:24:02,432 [AmExecutor-005] INFO  actions.QueueService -
QueueService processor terminated
2015-03-14 07:24:02,432 [AmExecutor-006] WARN  actions.ActionStopQueue - STOP
2015-03-14 07:24:02,432 [AmExecutor-006] INFO  actions.QueueExecutor -
Queue Executor run() stopped
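
In case it helps anyone triaging a similar log: below is a minimal Python
sketch, assuming the record layout seen in the excerpt above (timestamp,
thread name in brackets, level, then logger and message, possibly wrapped
onto following lines). The names parse_records and records_with_exceptions
are hypothetical helpers, not part of Slider. The sketch regroups the
wrapped lines into records and picks out the ones that mention an exception,
so the level each exception was logged at is easy to see.

```python
import re

# Assumed record start, based on the excerpt above:
# "2015-03-14 07:24:02,429 [thread name] LEVEL ..."
RECORD_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[([^\]]+)\]\s+([A-Z]+)\b"
)


def parse_records(text):
    """Regroup wrapped log lines into (timestamp, thread, level, body) tuples."""
    records = []
    for line in text.splitlines():
        match = RECORD_START.match(line)
        if match:
            timestamp, thread, level = match.groups()
            # Whatever follows the level on this line starts the body.
            rest = line[match.end():].strip()
            records.append([timestamp, thread, level, [rest]])
        elif records and line.strip():
            # Continuation line: a wrapped message or a stack frame.
            records[-1][3].append(line.strip())
    return [
        (ts, thread, level, " ".join(part for part in parts if part))
        for ts, thread, level, parts in records
    ]


def records_with_exceptions(records):
    """Return the records whose body mentions an exception class."""
    return [record for record in records if "Exception" in record[3]]
```

Run over the excerpt above, this would flag the InterruptedException as
belonging to an INFO record from the AMRM Callback Handler Thread, logged
after the "Exiting with status 0" line.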


Thanks,

Kishore



On Sat, Mar 14, 2015 at 7:28 PM, Steve Loughran <[email protected]>
wrote:

>
> Sorry, I think we've been creating confusion.
>
> Sumit was referring to the fact that in the app-specific python scripts
> inside an app package, there's a stop operation which isn't implemented;
> the specific component instances currently get destroyed without warning
> when the slider AM hands back the containers to YARN.
>
> The CLI "stop" operation is very much supported, and it should work.
>
> 1. The basic "slider stop cl1" operation is meant to find the running
> application and ask it to shut down. If this doesn't work, can we see (a)
> any stack trace on the client and (b) the tail end of the AM logs?
>
> 2. "slider stop cl1 --force" skips talking to the slider AM and talks to
> YARN directly. No matter what's going on inside the application, this will
> kill it. If it doesn't, either something has gone wrong on the client side
> when talking to YARN, or something is very, very wrong with the YARN system
> itself. Again, a client-side log will help us review this.
>
> -steve
>
>
> > On 14 Mar 2015, at 07:09, Krishna Kishore Bonagiri <[email protected]> wrote:
> >
> > Hi Sumit,
> > First of all thanks for the reply.
> >
> > What we have been trying is this kind of command from the CLI:
> >  slider stop cl1
> >
> >  So, as you are saying, it doesn't work yet. But what is the other way to
> > stop the application? What do you mean by "The only time stop is called,
> > today, is when the application is stopped the Slider Agents call Stop"?
> >
> > Kishore
> >
> > On Sat, Mar 14, 2015 at 10:56 AM, Sumit Mohanty <[email protected]>
> > wrote:
> >
> >> Stop is not wired up to the Stop command from the CLI. The only time stop
> >> is called, today, is when the application is stopped the Slider Agents call
> >> Stop and wait for ~10 seconds before killing the processes.
> >>
> >> On Fri, Mar 13, 2015 at 8:05 PM, Krishna Kishore Bonagiri <
> >> [email protected]> wrote:
> >>
> >>> Hi,
> >>>
> >>>  We are using Apache Slider 0.60 and have implemented the management
> >>> operations start, status, stop, etc. in a Python script. Everything else
> >>> is working, but the stop function is not getting invoked when the
> >>> container is stopped. Is this a known issue already? Or is there any
> >>> trick to make it work?
> >>>
> >>>
> >>> Thanks,
> >>> Kishore
> >>>
> >>
> >>
> >>
> >> --
> >> thanks
> >> Sumit
> >>
>
>
