Sure, I will send you the logs. The same pattern holds for the HBase installation too: the 'stop' command stops only the SliderAM, and the 'destroy' command stops only the HMaster and RegionServer. HBASE_REST and THRIFT_2 are still running after the destroy command, and the slider agents are still running on all 4 hosts where containers were launched.
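(For anyone following along: the check being repeated throughout this thread — "did stop/destroy really kill everything?" — can be scripted. Below is a minimal sketch, not part of Slider, that scans `ps -ef` output for processes still referencing a given YARN application id. The application id is the one from this thread; substitute your own.)

```python
import subprocess

def leftover_processes(ps_output, app_id):
    """Return the `ps -ef` lines that still reference app_id
    (slider agents, HMaster, HBASE_REST, storm daemons, etc.)."""
    return [line for line in ps_output.splitlines() if app_id in line]

if __name__ == "__main__":
    # Run on each host that had a container; an empty result means
    # nothing belonging to that application survived stop/destroy.
    out = subprocess.check_output(["ps", "-ef"]).decode("utf-8", "replace")
    for line in leftover_processes(out, "application_1428575950531_0013"):
        print(line)
```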
On Tue, Apr 28, 2015 at 3:00 AM, Gour Saha <[email protected]> wrote:

> To dig deeper, I would need to get hold of the Slider AM log (slider.log)
> and at least one of the agent logs (slider-agent.log), for Nimbus say.
>
> They will be under -
> /hadoop/yarn/log/<app_id>/<container_id>/
>
> OR you can run -
> yarn logs -applicationId <app_id>
> and dump it in a file, if the <app_id> directory under /hadoop/yarn/log
> is missing.
>
> Also if you could provide the Node Manager logs it would help. It is
> under -
> /var/log/hadoop-yarn/yarn/
>
> and a file name of the format - yarn-yarn-nodemanager-<hostname>.log
>
> -Gour
>
> On 4/27/15, 1:32 PM, "Chackravarthy Esakkimuthu" <[email protected]>
> wrote:
>
> >Run "slider start storm1" again, it should create
> >application_1428575950531_0014 (with id 0014).
> > ---> yes it does
> >
> >After that can you check if the processes from
> >application_1428575950531_0013 are still running?
> > ---> yes
> >
> >If yes, then run "slider stop storm1" again, and then do you see
> >processes from both application_1428575950531_0013 and
> >application_1428575950531_0014 running?
> > ---> yes, both are running, and I am able to access both storm UIs
> >also (only the SliderAM was stopped).
> >
> >On Tue, Apr 28, 2015 at 1:54 AM, Gour Saha <[email protected]> wrote:
> >
> >> Yes, those processes correspond to the slider agent.
> >>
> >> Based on the issue you are facing, let's do this -
> >>
> >> Run "slider start storm1" again; it should create
> >> application_1428575950531_0014 (with id 0014). After that, can you
> >> check if the processes from application_1428575950531_0013 are still
> >> running? If yes, then run "slider stop storm1" again, and then do you
> >> see processes from both application_1428575950531_0013 and
> >> application_1428575950531_0014 running?
> >>
> >> -Gour
> >>
> >> On 4/27/15, 1:11 PM, "Chackravarthy Esakkimuthu" <[email protected]>
> >> wrote:
> >>
> >> >And how do we confirm that the slider agents are stopped on each node
> >> >where a container was allocated? Because even after the stop command,
> >> >and even after the destroy command, the agents seem to be running on
> >> >all those nodes:
> >> >
> >> >yarn 47909 47907 0 00:37 ? 00:00:00 /bin/bash -c python
> >> >./infra/agent/slider-agent/agent/main.py --label
> >> >container_1428575950531_0013_01_000002___NIMBUS --zk-quorum
> >> >host1:2181,host2:2181,host3:2181 --zk-reg-path
> >> >/registry/users/yarn/services/org-apache-slider/storm1 >
> >> >/var/log/hadoop-yarn/application_1428575950531_0013/container_1428575950531_0013_01_000002/slider-agent.out
> >> >2>&1
> >> >yarn 47915 47909 0 00:37 ? 00:00:02 python
> >> >./infra/agent/slider-agent/agent/main.py --label
> >> >container_1428575950531_0013_01_000002___NIMBUS --zk-quorum
> >> >host1:2181,host2:2181,host3:2181 --zk-reg-path
> >> >/registry/users/yarn/services/org-apache-slider/storm1
> >> >
> >> >Don't these processes correspond to the slider agent?
> >> >
> >> >On Tue, Apr 28, 2015 at 1:32 AM, Chackravarthy Esakkimuthu
> >> ><[email protected]> wrote:
> >> >
> >> >> 1) slider create storm1
> >> >> --- it started all the components, the SliderAM, and the slider
> >> >> agents, and the storm UI was accessible. I also manually logged in
> >> >> to each host and verified that all components are up and running.
> >> >>
> >> >> 2) slider stop storm1
> >> >> --- it stopped the SliderAM
> >> >> --- but all the components kept running, along with the slider
> >> >> agents, and the storm UI was still accessible.
> >> >>
> >> >> 3) slider start storm1 (the RM UI was less responsive during this
> >> >> time)
> >> >> --- it started another SliderAM and another set of storm components
> >> >> and slider agents, and I am able to access the storm UI on another
> >> >> host.
> >> >>
> >> >> So now two storm clusters are actually running, even though I used
> >> >> the same name "storm1".
> >> >>
> >> >> On Tue, Apr 28, 2015 at 1:23 AM, Gour Saha <[email protected]>
> >> >> wrote:
> >> >>
> >> >>> Hmm.. Interesting.
> >> >>>
> >> >>> Is it possible to run "ps -ef | grep storm" before and after the
> >> >>> storm1 app is started and send the output?
> >> >>>
> >> >>> -Gour
> >> >>>
> >> >>> On 4/27/15, 12:48 PM, "Chackravarthy Esakkimuthu"
> >> >>> <[email protected]> wrote:
> >> >>>
> >> >>> >No, the processes are not old ones, because the class path they
> >> >>> >show has folder names corresponding to the newly launched
> >> >>> >application id. (Also, every time before launching a new
> >> >>> >application, I made sure that all processes were killed.)
> >> >>> >
> >> >>> >And the output of the list command is as follows:
> >> >>> >
> >> >>> >sudo -u yarn /usr/hdp/current/slider-client/bin/./slider list
> >> >>> >2015-04-28 01:14:24,568 [main] INFO impl.TimelineClientImpl -
> >> >>> >Timeline service address: http://host2:8188/ws/v1/timeline/
> >> >>> >2015-04-28 01:14:25,669 [main] INFO client.RMProxy - Connecting to
> >> >>> >ResourceManager at host2/XX.XX.XX.XX:8050
> >> >>> >storm1 FINISHED application_1428575950531_0013
> >> >>> >
> >> >>> >2015-04-28 01:14:26,108 [main] INFO util.ExitUtil - Exiting with
> >> >>> >status 0
> >> >>> >
> >> >>> >On Tue, Apr 28, 2015 at 1:01 AM, Gour Saha <[email protected]>
> >> >>> >wrote:
> >> >>> >
> >> >>> >> Sorry, forgot that --containers is supported in the develop
> >> >>> >> branch only. Just run list without that option.
> >> >>> >>
> >> >>> >> Seems like the running processes are stray processes from old
> >> >>> >> experimental runs. Can you check the date/time of these
> >> >>> >> processes?
> >> >>> >>
> >> >>> >> If you bring the storm instance up again, do you see new
> >> >>> >> instances of nimbus, supervisor, etc. getting created? The old
> >> >>> >> stray ones will probably still be there.
> >> >>> >>
> >> >>> >> Also, can you run just "slider list" (no other params) and send
> >> >>> >> the output?
> >> >>> >>
> >> >>> >> -Gour
> >> >>> >>
> >> >>> >> On 4/27/15, 12:20 PM, "Chackravarthy Esakkimuthu"
> >> >>> >> <[email protected]> wrote:
> >> >>> >>
> >> >>> >> >There is some issue with that command's usage (I tried giving
> >> >>> >> >the params in a different order also):
> >> >>> >> >
> >> >>> >> >sudo -u yarn /usr/hdp/current/slider-client/bin/./slider list
> >> >>> >> >storm1 --containers
> >> >>> >> >
> >> >>> >> >2015-04-28 00:42:01,017 [main] ERROR main.ServiceLauncher -
> >> >>> >> >com.beust.jcommander.ParameterException: Unknown option:
> >> >>> >> >--containers in list storm1 --containers
> >> >>> >> >
> >> >>> >> >2015-04-28 00:42:01,021 [main] INFO util.ExitUtil - Exiting
> >> >>> >> >with status 40
> >> >>> >> >
> >> >>> >> >Anyway, I issued the STOP command and checked the RM UI; the
> >> >>> >> >application is stopped and all 5 containers are released. It
> >> >>> >> >shows that ZERO containers are running.
> >> >>> >> >
> >> >>> >> >But when I log in to that machine, I can see that the storm
> >> >>> >> >components are still running there (ps -ef | grep storm). The
> >> >>> >> >processes are up, and even the Storm UI is still accessible.
> >> >>> >> >
> >> >>> >> >On Tue, Apr 28, 2015 at 12:29 AM, Gour Saha
> >> >>> >> ><[email protected]> wrote:
> >> >>> >> >
> >> >>> >> >> Calling "slider stop" before "slider destroy" is the right
> >> >>> >> >> order.
> >> >>> >> >>
> >> >>> >> >> On calling stop, your storm cluster should be completely
> >> >>> >> >> stopped (including the Slider AM and all storm components).
> >> >>> >> >>
> >> >>> >> >> Can you run this command after stop and send the output
> >> >>> >> >> (don't run destroy yet)?
> >> >>> >> >>
> >> >>> >> >> slider list <app-instance-name> --containers
> >> >>> >> >>
> >> >>> >> >> Also, at this point you should check the RM UI; it should
> >> >>> >> >> show that the yarn app is in the stopped state.
> >> >>> >> >>
> >> >>> >> >> -Gour
> >> >>> >> >>
> >> >>> >> >> On 4/27/15, 11:52 AM, "Chackravarthy Esakkimuthu"
> >> >>> >> >> <[email protected]> wrote:
> >> >>> >> >>
> >> >>> >> >> >I started storm on yarn (slider create).
> >> >>> >> >> >Then I wanted to test whether destroying the storm cluster
> >> >>> >> >> >works or not, so I tried the following order:
> >> >>> >> >> >
> >> >>> >> >> >1) slider stop <app-instance-name>
> >> >>> >> >> >-- in this case, the SliderAM alone stopped, and all the
> >> >>> >> >> >other storm daemons like Nimbus, supervisor, log_viewer,
> >> >>> >> >> >drpc, and UI_Server were running (along with the slider
> >> >>> >> >> >agents).
> >> >>> >> >> >
> >> >>> >> >> >Is this just an intermediate state before issuing the
> >> >>> >> >> >destroy command?
> >> >>> >> >> >
> >> >>> >> >> >2) slider destroy <app-instance-name>
> >> >>> >> >> >-- in this case, only nimbus and supervisor got killed. The
> >> >>> >> >> >other storm daemons (log_viewer, drpc, UI_Server) are still
> >> >>> >> >> >running, and the slider agents too are still running in all
> >> >>> >> >> >4 containers.
> >> >>> >> >> >
> >> >>> >> >> >I faced this issue in the 0.60 release. Then I tried the
> >> >>> >> >> >0.71 release, but the same behaviour still exists.
> >> >>> >> >> >
> >> >>> >> >> >Am I using the commands in the wrong way (or in some other
> >> >>> >> >> >order)? Or does the issue exist?
> >> >>> >> >> >
> >> >>> >> >> >Thanks in advance!
> >> >>> >> >> >
> >> >>> >> >> >Thanks,
> >> >>> >> >> >Chackra
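(A small aside on the `slider list` output pasted earlier in the thread: it interleaves INFO log lines with the actual data rows, which makes it awkward to script against. Below is a hypothetical helper, not a Slider API, that pulls out the (name, state, application id) rows, assuming the row format shown in the thread.)

```python
import re

# One data row of `slider list` output, e.g.:
#   storm1   FINISHED   application_1428575950531_0013
ROW = re.compile(
    r"^(?P<name>\S+)\s+"
    r"(?P<state>ACCEPTED|RUNNING|FINISHED|FAILED|KILLED)\s+"
    r"(?P<app_id>application_\d+_\d+)\s*$"
)

def parse_slider_list(output):
    """Extract (name, state, app_id) tuples, ignoring the log lines."""
    rows = []
    for line in output.splitlines():
        m = ROW.match(line.strip())
        if m:
            rows.append((m.group("name"), m.group("state"), m.group("app_id")))
    return rows
```

With the output quoted above, this yields a single row confirming that storm1 is in the FINISHED state even though its processes are still alive on the hosts, which is exactly the mismatch being debugged here.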
