This error is usually harmless. It happens when the application is being 
stopped (slider stop cl1) while Slider Agents may still be heartbeating with 
the AppMaster.

<snip>
> impl.AMRMClientAsyncImpl - Interrupted while waiting for queue
> java.lang.InterruptedException
>         at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
>         at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
>         at
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>         at
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:274)
</snip>

What is not implemented is an explicit call to "stop function in the python 
scripts".

What I was referring to is that the Agent attempts to call stop in the 
python script, but the call is not guaranteed to happen. It is not guaranteed 
because the call to stop() and the killing of the containers by YARN are not 
coordinated.

In summary, a guaranteed call to the stop() function in the python script is 
not implemented. It's in the plan though.

________________________________________
From: Ted Yu <[email protected]>
Sent: Saturday, March 14, 2015 8:52 AM
To: [email protected]
Subject: Re: Apache Slider stop function not working

Kishore:
Looks like logging was at INFO level.
Do you mind turning on DEBUG logging ?

Thanks

On Sat, Mar 14, 2015 at 7:39 AM, Krishna Kishore Bonagiri <
[email protected]> wrote:

> Hi Steve,
>
>   This is what I see in the AM's log since the STOP command is issued. Even
> though it indicates that STOP command SUCCEEDED, I see that the stop
> function in my python script is not getting executed. Does the exception at
> the end of this log indicate something?
>
> 2015-03-14 07:24:01,202 [IPC Server handler 2 on 39387] INFO
> appmaster.SliderAppMaster - SliderAppMasterApi.stopCluster: stop
> command issued:  exit code = 0, SUCCEEDED: stop command issued;
> 2015-03-14 07:24:02,202 [AmExecutor-006] INFO
> appmaster.SliderAppMaster - SliderAppMasterApi.stopCluster: stop
> command issued
> 2015-03-14 07:24:02,202 [main] INFO  appmaster.SliderAppMaster -
> Triggering shutdown of the AM: stop command issued:  exit code = 0,
> SUCCEEDED: stop command issued;
> 2015-03-14 07:24:02,202 [main] INFO  appmaster.SliderAppMaster -
> Process has exited with exit code 0 mapped to 0 -ignoring
> 2015-03-14 07:24:02,202 [main] INFO  workflow.WorkflowCompositeService
> - Child service completed Service RoleLaunchService in state
> RoleLaunchService: STOPPED
> 2015-03-14 07:24:02,202 [main] INFO  state.AppState - Releasing 2
> containers
> 2015-03-14 07:24:02,203 [main] INFO  state.AppState - Releasing
> container. Log:
>
> http://bdvs1395.svl.ibm.com:19888/jobhistory/logs/bdvs1395.svl.ibm.com:45454/container_1425452295813_0123_01_000002/ctx/bigsql
> 2015-03-14 07:24:02,203 [main] INFO  state.AppState - Releasing
> container. Log:
>
> http://bdvs1395.svl.ibm.com:19888/jobhistory/logs/bdvs1396.svl.ibm.com:45454/container_1425452295813_0123_01_000003/ctx/bigsql
> 2015-03-14 07:24:02,204 [main] INFO  appmaster.SliderAppMaster -
> Application completed. Signalling finish to RM
> 2015-03-14 07:24:02,204 [main] INFO  appmaster.SliderAppMaster -
> Unregistering AM status=SUCCEEDED message=stop command issued
> 2015-03-14 07:24:02,209 [main] INFO  impl.AMRMClientImpl - Waiting for
> application to be successfully unregistered.
> 2015-03-14 07:24:02,310 [main] INFO  appmaster.SliderAppMaster -
> Exiting AM; final exit code = 0
> 2015-03-14 07:24:02,312 [main] INFO  util.ExitUtil - Exiting with status 0
> 2015-03-14 07:24:02,326 [Shutdown] INFO  mortbay.log - Shutdown hook
> executing
> 2015-03-14 07:24:02,343 [Shutdown] INFO  mortbay.log - Stopped
> [email protected]:45840
> 2015-03-14 07:24:02,354 [Thread-1] INFO  mortbay.log - Stopped
> [email protected]:0
> 2015-03-14 07:24:02,355 [Shutdown] INFO  mortbay.log - Stopped
> [email protected]:48056
> 2015-03-14 07:24:02,358 [Shutdown] INFO  mortbay.log - Shutdown hook
> complete
> 2015-03-14 07:24:02,364 [Thread-1] INFO  ipc.Server - Stopping server on
> 39387
> 2015-03-14 07:24:02,365 [IPC Server listener on 39387] INFO
> ipc.Server - Stopping IPC Server listener on 39387
> 2015-03-14 07:24:02,366 [IPC Server Responder] INFO  ipc.Server -
> Stopping IPC Server Responder
> 2015-03-14 07:24:02,367 [Thread-1] INFO
> impl.ContainerManagementProtocolProxy - Opening proxy :
> bdvs1395.svl.ibm.com:45454
> 2015-03-14 07:24:02,383 [Thread-1] INFO
> impl.ContainerManagementProtocolProxy - Opening proxy :
> bdvs1396.svl.ibm.com:45454
> 2015-03-14 07:24:02,429 [AMRM Callback Handler Thread] INFO
> impl.AMRMClientAsyncImpl - Interrupted while waiting for queue
> java.lang.InterruptedException
>         at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
>         at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
>         at
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>         at
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:274)
> 2015-03-14 07:24:02,432 [AmExecutor-005] INFO  actions.QueueService -
> QueueService processor terminated
> 2015-03-14 07:24:02,432 [AmExecutor-006] WARN  actions.ActionStopQueue -
> STOP
> 2015-03-14 07:24:02,432 [AmExecutor-006] INFO  actions.QueueExecutor -
> Queue Executor run() stopped
>
>
> Thanks,
>
> Kishore
>
>
>
> On Sat, Mar 14, 2015 at 7:28 PM, Steve Loughran <[email protected]>
> wrote:
>
> >
> > Sorry, I think we've been creating confusion
> >
> > Sumit was referring to the fact that in the app-specific python scripts
> > inside an app package, there's a stop operation which isn't implemented;
> > the specific component instances currently get destroyed without warning
> > when the slider AM hands back the containers to YARN.
> >
> > The CLI "stop" operation is very much supported, and it should work.
> >
> > 1. The basic "slider stop cl1" operation is meant to find the running
> > application and ask it to shut down. If this doesn't work, can we see (a)
> > any stack trace on the client and (b) the tail end of the AM logs.
> >
> > 2. "slider stop cl1 --force" skips talking to the slider AM and talks to
> > YARN direct. No matter what's going on inside the application, this will
> > kill it. If it doesn't, there's something gone wrong on the client side
> > about talking to YARN, or something very very wrong with the YARN system
> > itself. Again, a client-side log will help us review this
> >
> > -steve
> >
> >
> > > On 14 Mar 2015, at 07:09, Krishna Kishore Bonagiri <
> > [email protected]> wrote:
> > >
> > > Hi Sumit,
> > > First of all thanks for the reply.
> > >
> > > What we have been trying is this kind of command from CLI.
> > >  slider stop cl1
> > >
> > >  So, as you are saying it doesn't yet work. But what is the other way
> to
> > > stop the application? What do you mean by "The only time stop is
> called,
> > > today, is when the application is stopped the Slider Agents call Stop"?
> > >
> > > Kishore
> > >
> > > On Sat, Mar 14, 2015 at 10:56 AM, Sumit Mohanty <
> [email protected]
> > >
> > > wrote:
> > >
> > >> Stop is not wired up to the Stop command from the CLI. The only time
> > stop
> > >> is called, today, is when the application is stopped the Slider Agents
> > call
> > >> Stop and wait for ~10 seconds before killing the processes.
> > >>
> > >> On Fri, Mar 13, 2015 at 8:05 PM, Krishna Kishore Bonagiri <
> > >> [email protected]> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>>  We are using Apache Slider 0.60 and implemented the management
> > >> operations
> > >>> start, status, stop, etc. in python script. Everything else is
> working
> > >> but
> > >>> the stop function is not getting invoked when the container is
> stopped.
> > >> Is
> > >>> this a known issue already? or is there any trick to make it work?
> > >>>
> > >>>
> > >>> Thanks,
> > >>> Kishore
> > >>>
> > >>
> > >>
> > >>
> > >> --
> > >> thanks
> > >> Sumit
> > >>
> >
> >
>
