And how do we confirm that the slider agents are stopped on each node where a
container was allocated? Because even after the stop command, and even after
the destroy command, the agents still appear to be running on all of those
nodes.

yarn     47909 47907  0 00:37 ?        00:00:00 /bin/bash -c python
./infra/agent/slider-agent/agent/main.py --label
container_1428575950531_0013_01_000002___NIMBUS --zk-quorum
host1:2181,host2:2181,host3:2181 --zk-reg-path
/registry/users/yarn/services/org-apache-slider/storm1 >
/var/log/hadoop-yarn/application_1428575950531_0013/container_1428575950531_0013_01_000002/slider-agent.out
2>&1
yarn     47915 47909  0 00:37 ?        00:00:02 python
./infra/agent/slider-agent/agent/main.py --label
container_1428575950531_0013_01_000002___NIMBUS --zk-quorum
host1:2181,host2:2181,host3:2181 --zk-reg-path
/registry/users/yarn/services/org-apache-slider/storm1

Don't these processes correspond to the slider agent?
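
For reference, this is roughly how I am checking for leftover agents on each
node right now (just a rough sketch; the host names are placeholders for our
cluster, and it assumes passwordless ssh as the yarn user):

for host in host1 host2 host3; do
  echo "== $host =="
  # any slider agent left behind by a container should show up here
  ssh "$host" "ps -ef | grep 'slider-agent/agent/main.py' | grep -v grep" \
    || echo "no slider agent running on $host"
done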

On Tue, Apr 28, 2015 at 1:32 AM, Chackravarthy Esakkimuthu <
[email protected]> wrote:

> 1) slider create storm1
> --- it started all the components, the SliderAM and the slider agents, and
> the storm UI was accessible. I also manually logged into each host and
> verified that all components were up and running.
>
> 2) slider stop storm1
> --- it stopped the SliderAM
> --- but all the components kept running, along with the slider agents, and
> the storm UI was still accessible.
>
> 3) slider start storm1 (the RM UI was less responsive during this time)
> --- it started another SliderAM and another set of storm components and
> slider agents, and the storm UI was accessible on another host.
>
> So now two storm clusters are actually running, even though I used the same
> name "storm1".
>
> On Tue, Apr 28, 2015 at 1:23 AM, Gour Saha <[email protected]> wrote:
>
>> Hmm.. Interesting.
>>
>> Is it possible to run "ps -ef | grep storm" before and after the storm1
>> app is started and send the output?
>>
>> -Gour
>>
>> On 4/27/15, 12:48 PM, "Chackravarthy Esakkimuthu" <[email protected]>
>> wrote:
>>
>> >No, the processes are not old ones, because their class paths contain
>> >folder names that correspond to the newly launched application id. (Also,
>> >every time before launching a new application, I made sure that all
>> >processes were killed.)
>> >
>> >And the output of list command as follows :
>> >
>> >sudo -u yarn /usr/hdp/current/slider-client/bin/./slider list
>> >2015-04-28 01:14:24,568 [main] INFO  impl.TimelineClientImpl - Timeline
>> >service address: http://host2:8188/ws/v1/timeline/
>> >2015-04-28 01:14:25,669 [main] INFO  client.RMProxy - Connecting to
>> >ResourceManager at host2/XX.XX.XX.XX:8050
>> >storm1                            FINISHED  application_1428575950531_0013
>> >
>> >2015-04-28 01:14:26,108 [main] INFO  util.ExitUtil - Exiting with status 0
>> >
>> >On Tue, Apr 28, 2015 at 1:01 AM, Gour Saha <[email protected]>
>> wrote:
>> >
>> >> Sorry, forgot that --containers is supported in the develop branch only.
>> >> Just run list without that option.
>> >>
>> >> Seems like the running processes are stray processes from old
>> >> experimental runs. Can you check the date/time of these processes?
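>> >>
>> >> (A hypothetical way to see their start time and age, assuming a Linux
>> >> procps ps; the grep pattern is just an example:)
>> >>
>> >> ps -eo pid,lstart,etime,args | grep storm | grep -v grep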
>> >>
>> >> If you bring the storm instance up again, do you see new instances of
>> >> nimbus, supervisor, etc. getting created? The old stray ones will
>> >> probably still be there.
>> >>
>> >> Also, can you run just “slider list” (no other params) and send the
>> >> output?
>> >>
>> >> -Gour
>> >>
>> >> On 4/27/15, 12:20 PM, "Chackravarthy Esakkimuthu"
>> >><[email protected]>
>> >> wrote:
>> >>
>> >> >There is some issue with that command usage (I tried giving the params
>> >> >in a different order as well):
>> >> >
>> >> >sudo -u yarn /usr/hdp/current/slider-client/bin/./slider list storm1 --containers
>> >> >
>> >> >2015-04-28 00:42:01,017 [main] ERROR main.ServiceLauncher -
>> >> >com.beust.jcommander.ParameterException: Unknown option: --containers in
>> >> >list storm1 --containers
>> >> >
>> >> >2015-04-28 00:42:01,021 [main] INFO  util.ExitUtil - Exiting with status 40
>> >> >
>> >> >Anyway, I issued the STOP command and checked the RM UI: the application
>> >> >is stopped and all 5 containers are released. It shows ZERO containers
>> >> >running.
>> >> >
>> >> >But when I log in to that machine, I can see the storm components are
>> >> >still running there (ps -ef | grep storm). The processes are up, and even
>> >> >the Storm UI is still accessible.
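>> >> >
>> >> >(As a quick hypothetical check that the UI really is still serving;
>> >> >the host and port are placeholders for our setup:)
>> >> >
>> >> >curl -s -o /dev/null -w '%{http_code}\n' http://<storm-ui-host>:<ui-port>/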
>> >> >
>> >> >
>> >> >
>> >> >On Tue, Apr 28, 2015 at 12:29 AM, Gour Saha <[email protected]>
>> >> wrote:
>> >> >
>> >> >> Calling "slider stop" before "slider destroy" is the right order.
>> >> >>
>> >> >> On calling stop, your storm cluster should be completely stopped
>> >> >> (including Slider AM and all storm components).
>> >> >>
>> >> >> Can you run this command after stop and send the output (don't run
>> >> >> destroy yet)?
>> >> >>
>> >> >> slider list <app-instance-name> --containers
>> >> >>
>> >> >> Also, at this point you should check the RM UI and it should show
>> >> >> that the yarn app is in the stopped state.
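>> >> >>
>> >> >> (If the RM UI is slow to load, the yarn CLI should report the same
>> >> >> state; the application id below is just a placeholder, whatever
>> >> >> "slider list" or the RM shows:)
>> >> >>
>> >> >> yarn application -status <application-id>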
>> >> >>
>> >> >> -Gour
>> >> >>
>> >> >> On 4/27/15, 11:52 AM, "Chackravarthy Esakkimuthu"
>> >> >><[email protected]>
>> >> >> wrote:
>> >> >>
>> >> >> >I started storm on yarn (slider create).
>> >> >> >Then I wanted to test whether destroying the storm cluster works or
>> >> >> >not, so I tried the following order:
>> >> >> >
>> >> >> >1) slider stop <app-instance-name>
>> >> >> >-- in this case, only the SliderAM stopped, and all the other storm
>> >> >> >daemons like Nimbus, supervisor, log_viewer, drpc and UI_Server were
>> >> >> >still running (along with the slider agents).
>> >> >> >
>> >> >> >Is this just an intermediate state before issuing the destroy command?
>> >> >> >
>> >> >> >2) slider destroy <app-instance-name>
>> >> >> >-- in this case, only nimbus and supervisor got killed. The other storm
>> >> >> >daemons (log_viewer, drpc, UI_Server) are still running, and the slider
>> >> >> >agents are also still running in all 4 containers.
>> >> >> >
>> >> >> >I faced this issue in the 0.60 release, then tried the 0.71 release,
>> >> >> >but the same behaviour still exists.
>> >> >> >
>> >> >> >Am I using the commands in the wrong way (or in some other order), or
>> >> >> >is this an actual issue?
>> >> >> >
>> >> >> >Thanks in advance!
>> >> >> >
>> >> >> >
>> >> >> >Thanks,
>> >> >> >Chackra
>> >> >>
>> >> >>
>> >>
>> >>
>>
>>
>
