No, I don't think iptables is enabled (will confirm). But the AM is running, the other containers are running too, and I can see the storm/hbase daemons running on those nodes. Does this mean the installation was successful? How do I check the status of the installation?
Tried using the slider command with no success (please let me know if I am using it wrongly). storm-yarn-1 and hb1 are the names I used for the "slider create" command.

/usr/hdp/current/slider-client/bin/./slider status *storm-yarn-1*
2015-04-08 21:40:17,178 [main] INFO impl.TimelineClientImpl - Timeline service address: http://host2:8188/ws/v1/timeline/
2015-04-08 21:40:17,782 [main] WARN shortcircuit.DomainSocketFactory - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2015-04-08 21:40:17,936 [main] INFO client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2015-04-08 21:40:17,970 [main] ERROR main.ServiceLauncher - *Unknown application instance : storm-yarn-1*
2015-04-08 21:40:17,971 [main] INFO util.ExitUtil - Exiting with status 69

/usr/hdp/current/slider-client/bin/./slider status *hb1*
2015-04-08 21:40:31,344 [main] INFO impl.TimelineClientImpl - Timeline service address: http://host2:8188/ws/v1/timeline/
2015-04-08 21:40:32,075 [main] WARN shortcircuit.DomainSocketFactory - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2015-04-08 21:40:32,263 [main] INFO client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2015-04-08 21:40:32,306 [main] ERROR main.ServiceLauncher - *Unknown application instance : hb1*
2015-04-08 21:40:32,308 [main] INFO util.ExitUtil - Exiting with status 69

On Wed, Apr 8, 2015 at 7:14 PM, Jon Maron <[email protected]> wrote:

> Indications seem to be that the AM is started, but the AM URI you’re attempting to attach to may be mistaken, or there may be something preventing the actual connection. Any chance iptables is enabled?
>
> > On Apr 8, 2015, at 3:44 AM, Gour Saha <[email protected]> wrote:
> >
> > Jon was right. I think Storm uses ${USER_NAME} for app_user instead of hard-coding it as yarn, unlike hbase. So either user was fine.
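For what it's worth, the "Unknown application instance" error with exit status 69 above usually means the name passed to "slider status" is not registered for the user running the command (note the earlier discussion of hdfs vs yarn users - the instance is only visible to the user who created it). "slider list" shows what the client can actually see. A minimal sketch; the interpret_slider_exit helper below is illustrative, not part of the Slider CLI, and status 69 is simply the value shown in the logs above:

```shell
#!/bin/sh
# List the instances Slider knows about for this user, then query one
# (these two commands need a live cluster; shown for reference):
#   /usr/hdp/current/slider-client/bin/slider list
#   /usr/hdp/current/slider-client/bin/slider status storm-yarn-1

# interpret_slider_exit: illustrative helper that turns the slider exit
# status into a hint (69 is what the logs above show for an unknown name).
interpret_slider_exit() {
  case "$1" in
    0)  echo "instance found" ;;
    69) echo "unknown application instance - run 'slider list' to see registered names" ;;
    *)  echo "other failure (status $1)" ;;
  esac
}

interpret_slider_exit 69
```

Running the status command as the same user that ran "slider create" (and checking `$?` afterwards) should distinguish "instance missing" from "instance present but unhealthy".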
> > One thing I saw in the AM and RM urls is that they link to zs-aaa-001.nm.flipkart.com and zs-exp-01.nm.flipkart.com. Can you hand edit the AM URL to try both the host aliases?
> >
> > I am not sure if the above will work, in which case it would be great if you could send the entire AM logs.
> >
> > -Gour
> >
> > - Sent from my iPhone
> >
> >> On Apr 7, 2015, at 11:08 PM, "Chackravarthy Esakkimuthu" <[email protected]> wrote:
> >>
> >> Tried running with 'yarn' user, but it remains in the same state.
> >> AM link not working, and AM logs are similar.
> >>
> >> On Wed, Apr 8, 2015 at 2:14 AM, Gour Saha <[email protected]> wrote:
> >>
> >>> In a non-secured cluster you should run as yarn. Can you do that and let us know how it goes?
> >>>
> >>> Also you can stop your existing storm instance in hdfs user (run as hdfs user) by running stop first -
> >>> slider stop storm1
> >>>
> >>> -Gour
> >>>
> >>> On 4/7/15, 1:39 PM, "Chackravarthy Esakkimuthu" <[email protected]> wrote:
> >>>
> >>>> This is not a secured cluster.
> >>>> And yes, I used 'hdfs' user while running slider create.
> >>>>
> >>>>> On Wed, Apr 8, 2015 at 2:03 AM, Gour Saha <[email protected]> wrote:
> >>>>>
> >>>>> Which user are you running the slider create command as? Seems like you are running as hdfs user. Is this a secured cluster?
> >>>>>
> >>>>> -Gour
> >>>>>
> >>>>> On 4/7/15, 1:06 PM, "Chackravarthy Esakkimuthu" <[email protected]> wrote:
> >>>>>
> >>>>>> yes, RM HA has been setup in this cluster.
> >>>>>> Active : zs-aaa-001.nm.flipkart.com
> >>>>>> Standby : zs-aaa-002.nm.flipkart.com
> >>>>>>
> >>>>>> RM Link : http://zs-aaa-001.nm.flipkart.com:8088/cluster/scheduler <http://zs-exp-01.nm.flipkart.com:8088/cluster/scheduler>
> >>>>>>
> >>>>>> AM Link : http://zs-aaa-001.nm.flipkart.com:8088/proxy/application_1427882795362_0070/slideram <http://zs-exp-01.nm.flipkart.com:8088/proxy/application_1427882795362_0070/slideram>
> >>>>>>
> >>>>>>> On Wed, Apr 8, 2015 at 1:05 AM, Gour Saha <[email protected]> wrote:
> >>>>>>>
> >>>>>>> Sorry, forgot that the AM link not working was the original issue.
> >>>>>>>
> >>>>>>> A few more things -
> >>>>>>> - Seems like you have RM HA setup, right?
> >>>>>>> - Can you copy paste the complete link of the RM UI and the URL of the ApplicationMaster (the link which is broken) with actual hostnames?
> >>>>>>>
> >>>>>>> -Gour
> >>>>>>>
> >>>>>>> On 4/7/15, 11:43 AM, "Chackravarthy Esakkimuthu" <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Since 5 containers are running, does that mean the Storm daemons are already up and running?
> >>>>>>>>
> >>>>>>>> Actually the ApplicationMaster link is not working. It just blanks out, printing the following:
> >>>>>>>>
> >>>>>>>> This is standby RM.
> >>>>>>>> Redirecting to the current active RM: http://<host-name>:8088/proxy/application_1427882795362_0070/slideram
> >>>>>>>>
> >>>>>>>> And for resources.json, I didn't make any change and used the copy of resources-default.json as follows:
> >>>>>>>>
> >>>>>>>> {
> >>>>>>>>   "schema" : "http://example.org/specification/v2.0.0",
> >>>>>>>>   "metadata" : {
> >>>>>>>>   },
> >>>>>>>>   "global" : {
> >>>>>>>>     "yarn.log.include.patterns": "",
> >>>>>>>>     "yarn.log.exclude.patterns": ""
> >>>>>>>>   },
> >>>>>>>>   "components": {
> >>>>>>>>     "slider-appmaster": {
> >>>>>>>>       "yarn.memory": "512"
> >>>>>>>>     },
> >>>>>>>>     "NIMBUS": {
> >>>>>>>>       "yarn.role.priority": "1",
> >>>>>>>>       "yarn.component.instances": "1",
> >>>>>>>>       "yarn.memory": "2048"
> >>>>>>>>     },
> >>>>>>>>     "STORM_UI_SERVER": {
> >>>>>>>>       "yarn.role.priority": "2",
> >>>>>>>>       "yarn.component.instances": "1",
> >>>>>>>>       "yarn.memory": "1278"
> >>>>>>>>     },
> >>>>>>>>     "DRPC_SERVER": {
> >>>>>>>>       "yarn.role.priority": "3",
> >>>>>>>>       "yarn.component.instances": "1",
> >>>>>>>>       "yarn.memory": "1278"
> >>>>>>>>     },
> >>>>>>>>     "SUPERVISOR": {
> >>>>>>>>       "yarn.role.priority": "4",
> >>>>>>>>       "yarn.component.instances": "1",
> >>>>>>>>       "yarn.memory": "3072"
> >>>>>>>>     }
> >>>>>>>>   }
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>>> On Tue, Apr 7, 2015 at 11:52 PM, Gour Saha <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>> Chackra sent the attachment directly to me. From what I see the cluster resources (memory and cores) are abundant.
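On the "This is standby RM" redirect quoted above: in an RM HA pair, "yarn rmadmin -getServiceState" reports which ResourceManager is currently active, so the AM proxy URL can be built against the active host instead of relying on the standby's redirect. A sketch; the am_proxy_url helper is illustrative (not a YARN command), 8088 is the default RM web port, and the host/application id are the ones from this thread:

```shell
#!/bin/sh
# Check which RM is active (needs a live cluster; shown for reference,
# rm1/rm2 are the service ids seen in the failover log messages above):
#   yarn rmadmin -getServiceState rm1
#   yarn rmadmin -getServiceState rm2

# am_proxy_url: compose the AM proxy URL for a given RM host and app id
# (illustrative helper for hand-editing the broken link).
am_proxy_url() {
  echo "http://$1:8088/proxy/$2/"
}

am_proxy_url zs-aaa-001.nm.flipkart.com application_1427882795362_0070
```

If the hand-built URL against the active RM still blanks out, that points back at the proxy/firewall path rather than the AM itself.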
> >>>>>>>>> But I also see that only 1 app is running, which is the one we are trying to debug, and 5 containers are running. So definitely more containers than just the AM are running.
> >>>>>>>>>
> >>>>>>>>> Can you click on the app master link and copy paste the content of that page? No need for a screenshot. Also please send your resources JSON file.
> >>>>>>>>>
> >>>>>>>>> -Gour
> >>>>>>>>>
> >>>>>>>>> - Sent from my iPhone
> >>>>>>>>>
> >>>>>>>>>> On Apr 7, 2015, at 11:01 AM, "Jon Maron" <[email protected]> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> On Apr 7, 2015, at 1:36 PM, Chackravarthy Esakkimuthu <[email protected]<mailto:[email protected]>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> @Maron, I could not get the logs even though the application is still running.
> >>>>>>>>>> It's a 10 node cluster and I logged into one of the nodes and executed the command:
> >>>>>>>>>>
> >>>>>>>>>> sudo -u hdfs yarn logs -applicationId application_1427882795362_0070
> >>>>>>>>>> 15/04/07 22:56:09 INFO impl.TimelineClientImpl: Timeline service address: http://$HOST:PORT/ws/v1/timeline/
> >>>>>>>>>> 15/04/07 22:56:09 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
> >>>>>>>>>> /app-logs/hdfs/logs/application_1427882795362_0070 does not have any log files.
> >>>>>>>>>>
> >>>>>>>>>> Can you login to the cluster node and look at the logs directory (e.g. in an HDP install it would be under /hadoop/yarn/logs IIRC)?
> >>>>>>>>>>
> >>>>>>>>>> @Gour, Please find the attachment.
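On the "does not have any log files" message quoted above: the aggregated logs under /app-logs in HDFS are typically written only after containers finish, so while the application is still running the per-node NodeManager log directories are the place to look, as Jon suggests. A small sketch; the paths are typical HDP defaults governed by yarn.nodemanager.log-dirs, and the is_app_id helper is illustrative, not part of YARN:

```shell
#!/bin/sh
# While the app runs, check the NodeManager's local container logs on a
# cluster node instead of the aggregated HDFS logs (shown for reference):
#   ls /hadoop/yarn/log/application_1427882795362_0070/

# is_app_id: illustrative sanity check on an application id before passing
# it to "yarn logs -applicationId" (a typo gives an equally unhelpful
# "does not have any log files" style message).
is_app_id() {
  case "$1" in
    application_[0-9]*_[0-9]*) return 0 ;;
    *) return 1 ;;
  esac
}

is_app_id application_1427882795362_0070 && echo "looks valid"
```

Once the app is stopped (or aggregation completes), "yarn logs -applicationId <id>" run as the submitting user should return the full container logs.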
> >>>>>>>>>> On Tue, Apr 7, 2015 at 10:57 PM, Gour Saha <[email protected]<mailto:[email protected]>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Can you take a screenshot of your RM UI and send it over? It is usually available in a URI similar to http://c6410.ambari.apache.org:8088/cluster.
> >>>>>>>>>> I am specifically interested in seeing the Cluster Metrics table.
> >>>>>>>>>>
> >>>>>>>>>> -Gour
> >>>>>>>>>>
> >>>>>>>>>>> On 4/7/15, 10:17 AM, "Jon Maron" <[email protected]<mailto:[email protected]>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> On Apr 7, 2015, at 1:14 PM, Jon Maron <[email protected]<mailto:[email protected]>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On Apr 7, 2015, at 1:08 PM, Chackravarthy Esakkimuthu <[email protected]<mailto:[email protected]>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for the reply guys!
> >>>>>>>>>>>>> Container allocation happened successfully.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> *RoleStatus{name='slider-appmaster', key=0, minimum=0, maximum=1, desired=1, actual=1,*
> >>>>>>>>>>>>> *RoleStatus{name='STORM_UI_SERVER', key=2, minimum=0, maximum=1, desired=1, actual=1,*
> >>>>>>>>>>>>> *RoleStatus{name='NIMBUS', key=1, minimum=0, maximum=1, desired=1, actual=1,*
> >>>>>>>>>>>>> *RoleStatus{name='DRPC_SERVER', key=3, minimum=0, maximum=1, desired=1, actual=1,*
> >>>>>>>>>>>>> *RoleStatus{name='SUPERVISOR', key=4, minimum=0, maximum=1, desired=1, actual=1,*
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Also, have put some logs specific to a container..
(nimbus) Same set of logs available for the other Roles also (except Supervisor, which has only the first 2 lines of the below logs)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> *Installing NIMBUS on container_e04_1427882795362_0070_01_000002.*
> >>>>>>>>>>>>> *Starting NIMBUS on container_e04_1427882795362_0070_01_000002.*
> >>>>>>>>>>>>> *Registering component container_e04_1427882795362_0070_01_000002*
> >>>>>>>>>>>>> *Requesting applied config for NIMBUS on container_e04_1427882795362_0070_01_000002.*
> >>>>>>>>>>>>> *Received and processed config for container_e04_1427882795362_0070_01_000002___NIMBUS*
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Does this result in any intermediate state?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> @Maron, I didn't configure any port specifically.. do I need to? Also, I don't see any error msg in AM logs wrt port conflict.
> >>>>>>>>>>>>
> >>>>>>>>>>>> My only concern was whether you were actually accession the web UIs at the correct host and port. If you are, then the next step is probably to look at the actual storm/hbase logs. You can use the "yarn logs -applicationId .." command.
> >>>>>>>>>>> *accessing* ;)
> >>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Chackra
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Apr 7, 2015 at 9:02 PM, Jon Maron <[email protected]<mailto:[email protected]>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Apr 7, 2015, at 11:03 AM, Billie Rinaldi <[email protected]<mailto:[email protected]>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> One thing you can check is whether your system has enough resources to allocate all the containers the app needs. You will see info like the following in the AM log (it will be logged multiple times over the life of the AM). In this case, the master I requested was allocated but the tservers were not.
> >>>>>>>>>>>>>> RoleStatus{name='ACCUMULO_TSERVER', key=2, desired=2, actual=0, requested=2, releasing=0, failed=0, started=0, startFailed=0, completed=0, failureMessage=''}
> >>>>>>>>>>>>>> RoleStatus{name='ACCUMULO_MASTER', key=1, desired=1, actual=1, requested=0, releasing=0, failed=0, started=0, startFailed=0, completed=0, failureMessage=''}
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> You can also check the "Scheduler" link on the RM Web UI to get a sense of whether you are resource constrained.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Are you certain that you are attempting to invoke the correct port? The listening ports are dynamically allocated by Slider.
> >>>>>>>>>>>>>>> On Tue, Apr 7, 2015 at 3:29 AM, Chackravarthy Esakkimuthu <[email protected]<mailto:[email protected]>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hi All,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I am new to Apache Slider and would like to contribute.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Just to start with, I am trying out running "storm" and "hbase" on yarn using slider, following the guide:
> >>>>>>>>>>>>>>>> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.0/YARN_RM_v22/running_applications_on_slider/index.html#Item1.1
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> In both (storm and hbase) cases, the ApplicationMaster gets launched and is still running, but the ApplicationMaster link is not working, and from the AM logs, I don't see any errors.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> How do I debug from this? Please help me. In case there is any other mail thread with respect to this, please point it out to me. Thanks in advance.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>> Chackra
