Thank you Sumit.

On Sat, Mar 14, 2015 at 9:51 PM, Sumit Mohanty <[email protected]>
wrote:

> This error is usually harmless - it happens when the application is being
> stopped (slider stop cl1) and Slider Agents may still be heartbeating with
> the AppMaster.
>
> <snip>
> > impl.AMRMClientAsyncImpl - Interrupted while waiting for queue
> > java.lang.InterruptedException
> >         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
> >         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
> >         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> >         at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:274)
> </snip>
>
> What is not implemented is an explicit call to "stop function in the
> python scripts".
>
> What I was referring to is that the Agent attempts to call stop in the
> python script, but the call is not guaranteed. The reason is that the call
> to stop() and the killing of the containers by YARN are not coordinated.
>
> In summary, the ability to call stop() functions in the python script is
> not implemented. It's in the plan though.
>
> ________________________________________
> From: Ted Yu <[email protected]>
> Sent: Saturday, March 14, 2015 8:52 AM
> To: [email protected]
> Subject: Re: Apache Slider stop function not working
>
> Kishore:
> Looks like logging was at INFO level.
> Do you mind turning on DEBUG logging ?
>
> Thanks
>
> On Sat, Mar 14, 2015 at 7:39 AM, Krishna Kishore Bonagiri <
> [email protected]> wrote:
>
> > Hi Steve,
> >
> >   This is what I see in the AM's log since the STOP command was issued. Even
> > though it indicates that the STOP command SUCCEEDED, I see that the stop
> > function in my python script is not getting executed. Does the exception at
> > the end of this log indicate something?
> >
> > 2015-03-14 07:24:01,202 [IPC Server handler 2 on 39387] INFO
> > appmaster.SliderAppMaster - SliderAppMasterApi.stopCluster: stop
> > command issued:  exit code = 0, SUCCEEDED: stop command issued;
> > 2015-03-14 07:24:02,202 [AmExecutor-006] INFO
> > appmaster.SliderAppMaster - SliderAppMasterApi.stopCluster: stop
> > command issued
> > 2015-03-14 07:24:02,202 [main] INFO  appmaster.SliderAppMaster -
> > Triggering shutdown of the AM: stop command issued:  exit code = 0,
> > SUCCEEDED: stop command issued;
> > 2015-03-14 07:24:02,202 [main] INFO  appmaster.SliderAppMaster -
> > Process has exited with exit code 0 mapped to 0 -ignoring
> > 2015-03-14 07:24:02,202 [main] INFO  workflow.WorkflowCompositeService
> > - Child service completed Service RoleLaunchService in state
> > RoleLaunchService: STOPPED
> > 2015-03-14 07:24:02,202 [main] INFO  state.AppState - Releasing 2
> > containers
> > 2015-03-14 07:24:02,203 [main] INFO  state.AppState - Releasing
> > container. Log:
> > http://bdvs1395.svl.ibm.com:19888/jobhistory/logs/bdvs1395.svl.ibm.com:45454/container_1425452295813_0123_01_000002/ctx/bigsql
> > 2015-03-14 07:24:02,203 [main] INFO  state.AppState - Releasing
> > container. Log:
> > http://bdvs1395.svl.ibm.com:19888/jobhistory/logs/bdvs1396.svl.ibm.com:45454/container_1425452295813_0123_01_000003/ctx/bigsql
> > 2015-03-14 07:24:02,204 [main] INFO  appmaster.SliderAppMaster -
> > Application completed. Signalling finish to RM
> > 2015-03-14 07:24:02,204 [main] INFO  appmaster.SliderAppMaster -
> > Unregistering AM status=SUCCEEDED message=stop command issued
> > 2015-03-14 07:24:02,209 [main] INFO  impl.AMRMClientImpl - Waiting for
> > application to be successfully unregistered.
> > 2015-03-14 07:24:02,310 [main] INFO  appmaster.SliderAppMaster -
> > Exiting AM; final exit code = 0
> > 2015-03-14 07:24:02,312 [main] INFO  util.ExitUtil - Exiting with status 0
> > 2015-03-14 07:24:02,326 [Shutdown] INFO  mortbay.log - Shutdown hook
> > executing
> > 2015-03-14 07:24:02,343 [Shutdown] INFO  mortbay.log - Stopped
> > [email protected]:45840
> > 2015-03-14 07:24:02,354 [Thread-1] INFO  mortbay.log - Stopped
> > [email protected]:0
> > 2015-03-14 07:24:02,355 [Shutdown] INFO  mortbay.log - Stopped
> > [email protected]:48056
> > 2015-03-14 07:24:02,358 [Shutdown] INFO  mortbay.log - Shutdown hook
> > complete
> > 2015-03-14 07:24:02,364 [Thread-1] INFO  ipc.Server - Stopping server on
> > 39387
> > 2015-03-14 07:24:02,365 [IPC Server listener on 39387] INFO
> > ipc.Server - Stopping IPC Server listener on 39387
> > 2015-03-14 07:24:02,366 [IPC Server Responder] INFO  ipc.Server -
> > Stopping IPC Server Responder
> > 2015-03-14 07:24:02,367 [Thread-1] INFO
> > impl.ContainerManagementProtocolProxy - Opening proxy :
> > bdvs1395.svl.ibm.com:45454
> > 2015-03-14 07:24:02,383 [Thread-1] INFO
> > impl.ContainerManagementProtocolProxy - Opening proxy :
> > bdvs1396.svl.ibm.com:45454
> > 2015-03-14 07:24:02,429 [AMRM Callback Handler Thread] INFO
> > impl.AMRMClientAsyncImpl - Interrupted while waiting for queue
> > java.lang.InterruptedException
> >         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
> >         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
> >         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> >         at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:274)
> > 2015-03-14 07:24:02,432 [AmExecutor-005] INFO  actions.QueueService -
> > QueueService processor terminated
> > 2015-03-14 07:24:02,432 [AmExecutor-006] WARN  actions.ActionStopQueue -
> > STOP
> > 2015-03-14 07:24:02,432 [AmExecutor-006] INFO  actions.QueueExecutor -
> > Queue Executor run() stopped
> >
> >
> > Thanks,
> >
> > Kishore
> >
> >
> >
> > On Sat, Mar 14, 2015 at 7:28 PM, Steve Loughran <[email protected]>
> > wrote:
> >
> > >
> > > Sorry, I think we've been creating confusion.
> > >
> > > Sumit was referring to the fact that in the app-specific python scripts
> > > inside an app package, there's a stop operation which isn't implemented;
> > > the specific component instances currently get destroyed without warning
> > > when the slider AM hands back the containers to YARN.
> > >
> > > The CLI "stop" operation is very much supported, and it should work.
> > >
> > > 1. The basic "slider stop cl1" operation is meant to find the running
> > > application and ask it to shut down. If this doesn't work, can we see (a)
> > > any stack trace on the client and (b) the tail end of the AM logs?
> > >
> > > 2. "slider stop cl1 --force" skips talking to the slider AM and talks to
> > > YARN directly. No matter what's going on inside the application, this will
> > > kill it. If it doesn't, something has gone wrong on the client side
> > > when talking to YARN, or something is very, very wrong with the YARN system
> > > itself. Again, a client-side log will help us review this.
> > >
> > > -steve
> > >
> > >
> > > > On 14 Mar 2015, at 07:09, Krishna Kishore Bonagiri <
> > > [email protected]> wrote:
> > > >
> > > > Hi Sumit,
> > > > First of all thanks for the reply.
> > > >
> > > > What we have been trying is this kind of command from CLI.
> > > >  slider stop cl1
> > > >
> > > >  So, as you are saying it doesn't yet work. But what is the other way to
> > > > stop the application? What do you mean by "The only time stop is called,
> > > > today, is when the application is stopped the Slider Agents call Stop"?
> > > >
> > > > Kishore
> > > >
> > > > On Sat, Mar 14, 2015 at 10:56 AM, Sumit Mohanty <
> > [email protected]
> > > >
> > > > wrote:
> > > >
> > > >> Stop is not wired up to the Stop command from the CLI. The only time stop
> > > >> is called, today, is when the application is stopped the Slider Agents call
> > > >> Stop and wait for ~10 seconds before killing the processes.
> > > >>
> > > >> On Fri, Mar 13, 2015 at 8:05 PM, Krishna Kishore Bonagiri <
> > > >> [email protected]> wrote:
> > > >>
> > > >>> Hi,
> > > >>>
> > > >>>  We are using Apache Slider 0.60 and implemented the management operations
> > > >>> start, status, stop, etc. in a python script. Everything else is working but
> > > >>> the stop function is not getting invoked when the container is stopped. Is
> > > >>> this a known issue already? Or is there any trick to make it work?
> > > >>>
> > > >>>
> > > >>> Thanks,
> > > >>> Kishore
> > > >>>
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> thanks
> > > >> Sumit
> > > >>
> > >
> > >
> >
>
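
For readers following the thread: the behaviour Sumit describes (ask the process to stop, wait about 10 seconds, then kill it) is a common stop-then-kill pattern. Below is a minimal sketch of that pattern in Python. It is not the Slider Agent's actual code; the function names and the demo child process are hypothetical, and only the ~10 second grace period comes from the thread.

```python
import subprocess
import sys


def stop_with_grace_period(proc, grace_seconds=10.0):
    """Ask a child process to stop; kill it if it outlives the grace period.

    Hypothetical helper for illustration only. The 10 second default
    mirrors the "~10 seconds" mentioned in the thread.
    """
    if proc.poll() is not None:
        # The process has already exited; nothing to stop.
        return proc.returncode
    proc.terminate()  # polite request (SIGTERM on POSIX)
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # forced termination (SIGKILL on POSIX)
        return proc.wait()


def demo():
    """Start a long-sleeping child and stop it with a short grace period."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    return stop_with_grace_period(child, grace_seconds=2.0)
```

Because the container may be killed by YARN before any such stop call completes, application cleanup that must always happen should not rely on this path alone.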
