Yes, RM HA has been set up in this cluster.
Active : zs-aaa-001.nm.flipkart.com
Standby : zs-aaa-002.nm.flipkart.com
RM Link : http://zs-aaa-001.nm.flipkart.com:8088/cluster/scheduler <http://zs-exp-01.nm.flipkart.com:8088/cluster/scheduler>
AM Link : http://zs-aaa-001.nm.flipkart.com:8088/proxy/application_1427882795362_0070/slideram <http://zs-exp-01.nm.flipkart.com:8088/proxy/application_1427882795362_0070/slideram>

On Wed, Apr 8, 2015 at 1:05 AM, Gour Saha <[email protected]> wrote:

> Sorry, forgot that the AM link not working was the original issue.
>
> Few more things -
> - Seems like you have RM HA set up, right?
> - Can you copy paste the complete link of the RM UI and the URL of the
>   ApplicationMaster (the link which is broken) with actual hostnames?
>
> -Gour
>
> On 4/7/15, 11:43 AM, "Chackravarthy Esakkimuthu" <[email protected]>
> wrote:
>
> >Since 5 containers are running, does that mean that the Storm daemons are
> >already up and running?
> >
> >Actually the ApplicationMaster link is not working. It just blanks out,
> >printing the following:
> >
> >This is standby RM. Redirecting to the current active RM:
> >http://<host-name>:8088/proxy/application_1427882795362_0070/slideram
> >
> >And for resources.json, I didn't make any change and used the copy of
> >resources-default.json as follows:
> >
> >{
> >  "schema" : "http://example.org/specification/v2.0.0",
> >  "metadata" : {
> >  },
> >  "global" : {
> >    "yarn.log.include.patterns": "",
> >    "yarn.log.exclude.patterns": ""
> >  },
> >  "components": {
> >    "slider-appmaster": {
> >      "yarn.memory": "512"
> >    },
> >    "NIMBUS": {
> >      "yarn.role.priority": "1",
> >      "yarn.component.instances": "1",
> >      "yarn.memory": "2048"
> >    },
> >    "STORM_UI_SERVER": {
> >      "yarn.role.priority": "2",
> >      "yarn.component.instances": "1",
> >      "yarn.memory": "1278"
> >    },
> >    "DRPC_SERVER": {
> >      "yarn.role.priority": "3",
> >      "yarn.component.instances": "1",
> >      "yarn.memory": "1278"
> >    },
> >    "SUPERVISOR": {
> >      "yarn.role.priority": "4",
> >      "yarn.component.instances": "1",
> >      "yarn.memory": "3072"
> >    }
> >  }
> >}
> >
> >On Tue, Apr 7, 2015 at 11:52 PM, Gour Saha <[email protected]> wrote:
> >
> >> Chackra sent the attachment directly to me. From what I see the cluster
> >> resources (memory and cores) are abundant.
> >>
> >> But I also see that only 1 app is running, which is the one we are trying
> >> to debug, and 5 containers are running. So definitely more containers
> >> than just the AM are running.
> >>
> >> Can you click on the app master link and copy paste the content of that
> >> page? No need for a screenshot. Also please send your resources JSON file.
> >>
> >> -Gour
> >>
> >> - Sent from my iPhone
> >>
> >> > On Apr 7, 2015, at 11:01 AM, "Jon Maron" <[email protected]> wrote:
> >> >
> >> > On Apr 7, 2015, at 1:36 PM, Chackravarthy Esakkimuthu <[email protected]> wrote:
> >> >
> >> > @Maron, I could not get the logs even though the application is still running.
> >> > It's a 10 node cluster and I logged into one of the nodes and executed the command:
> >> >
> >> > sudo -u hdfs yarn logs -applicationId application_1427882795362_0070
> >> > 15/04/07 22:56:09 INFO impl.TimelineClientImpl: Timeline service address: http://$HOST:PORT/ws/v1/timeline/
> >> > 15/04/07 22:56:09 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
> >> > /app-logs/hdfs/logs/application_1427882795362_0070does not have any log files.
> >> >
> >> > Can you login to the cluster node and look at the logs directory (e.g. in an HDP install it would be under /hadoop/yarn/logs IIRC)?
> >> >
> >> > @Gour, Please find the attachment.
> >> >
> >> > On Tue, Apr 7, 2015 at 10:57 PM, Gour Saha <[email protected]> wrote:
> >> > Can you take a screenshot of your RM UI and send it over?
> >> > It is usually available in a URI similar to http://c6410.ambari.apache.org:8088/cluster.
> >> > I am specifically interested in seeing the Cluster Metrics table.
> >> >
> >> > -Gour
> >> >
> >> >> On 4/7/15, 10:17 AM, "Jon Maron" <[email protected]> wrote:
> >> >>
> >> >>> On Apr 7, 2015, at 1:14 PM, Jon Maron <[email protected]> wrote:
> >> >>>
> >> >>>> On Apr 7, 2015, at 1:08 PM, Chackravarthy Esakkimuthu
> >> >>>> <[email protected]> wrote:
> >> >>>>
> >> >>>> Thanks for the reply guys!
> >> >>>> Container allocation happened successfully.
> >> >>>>
> >> >>>> RoleStatus{name='slider-appmaster', key=0, minimum=0, maximum=1, desired=1, actual=1,
> >> >>>> RoleStatus{name='STORM_UI_SERVER', key=2, minimum=0, maximum=1, desired=1, actual=1,
> >> >>>> RoleStatus{name='NIMBUS', key=1, minimum=0, maximum=1, desired=1, actual=1,
> >> >>>> RoleStatus{name='DRPC_SERVER', key=3, minimum=0, maximum=1, desired=1, actual=1,
> >> >>>> RoleStatus{name='SUPERVISOR', key=4, minimum=0, maximum=1, desired=1, actual=1,
> >> >>>>
> >> >>>> Also, I have put some logs specific to a container (nimbus) below. The same
> >> >>>> set of logs is available for the other roles also (except Supervisor, which
> >> >>>> has only the first 2 lines of the logs below).
> >> >>>>
> >> >>>> Installing NIMBUS on container_e04_1427882795362_0070_01_000002.
> >> >>>> Starting NIMBUS on container_e04_1427882795362_0070_01_000002.
> >> >>>> Registering component container_e04_1427882795362_0070_01_000002
> >> >>>> Requesting applied config for NIMBUS on container_e04_1427882795362_0070_01_000002.
> >> >>>> Received and processed config for container_e04_1427882795362_0070_01_000002___NIMBUS
> >> >>>>
> >> >>>> Does this result in any intermediate state?
> >> >>>>
> >> >>>> @Maron, I didn't configure any port specifically. Do I need to?
> >> >>>> Also, I don't see any error msg in the AM logs wrt port conflict.
> >> >>>
> >> >>> My only concern was whether you were actually accession the web UIs at
> >> >>> the correct host and port. If you are, then the next step is probably to
> >> >>> look at the actual storm/hbase logs. You can use the "yarn logs
> >> >>> -applicationId .." command.
> >> >>
> >> >> *accessing* ;)
> >> >>
> >> >>>
> >> >>>> Thanks,
> >> >>>> Chackra
> >> >>>>
> >> >>>> On Tue, Apr 7, 2015 at 9:02 PM, Jon Maron <[email protected]> wrote:
> >> >>>>
> >> >>>>>> On Apr 7, 2015, at 11:03 AM, Billie Rinaldi <[email protected]> wrote:
> >> >>>>>>
> >> >>>>>> One thing you can check is whether your system has enough resources to
> >> >>>>>> allocate all the containers the app needs. You will see info like the
> >> >>>>>> following in the AM log (it will be logged multiple times over the life
> >> >>>>>> of the AM). In this case, the master I requested was allocated but the
> >> >>>>>> tservers were not.
> >> >>>>>> RoleStatus{name='ACCUMULO_TSERVER', key=2, desired=2, actual=0, requested=2, releasing=0, failed=0, started=0, startFailed=0, completed=0, failureMessage=''}
> >> >>>>>> RoleStatus{name='ACCUMULO_MASTER', key=1, desired=1, actual=1, requested=0, releasing=0, failed=0, started=0, startFailed=0, completed=0, failureMessage=''}
> >> >>>>>
> >> >>>>> You can also check the "Scheduler" link on the RM Web UI to get a
> >> >>>>> sense of whether you are resource constrained.
> >> >>>>>
> >> >>>>> Are you certain that you are attempting to invoke the correct port?
> >> >>>>> The listening ports are dynamically allocated by Slider.
> >> >>>>>
> >> >>>>>> On Tue, Apr 7, 2015 at 3:29 AM, Chackravarthy Esakkimuthu <[email protected]> wrote:
> >> >>>>>>
> >> >>>>>>> Hi All,
> >> >>>>>>>
> >> >>>>>>> I am new to Apache Slider and would like to contribute.
> >> >>>>>>>
> >> >>>>>>> Just to start with, I am trying out running "storm" and "hbase" on
> >> >>>>>>> YARN using Slider, following the guide:
> >> >>>>>>> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.0/YARN_RM_v22/running_applications_on_slider/index.html#Item1.1
> >> >>>>>>>
> >> >>>>>>> In both cases (storm and hbase), the ApplicationMaster gets launched
> >> >>>>>>> and is still running, but the ApplicationMaster link is not working,
> >> >>>>>>> and from the AM logs, I don't see any errors.
> >> >>>>>>>
> >> >>>>>>> How do I debug from this? Please help me.
> >> >>>>>>> In case there is any other mail thread about this, please point it
> >> >>>>>>> out to me. Thanks in advance.
> >> >>>>>>>
> >> >>>>>>> Thanks,
> >> >>>>>>> Chackra
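
For the resource check suggested in this thread, here is a rough Python sketch that totals the memory requested by the resources.json quoted above, so it can be compared against the Cluster Metrics table on the RM UI. The function name total_requested_memory_mb is made up for illustration, and treating a component without yarn.component.instances (slider-appmaster) as one instance is an assumption. YARN may also round each container up to the scheduler's allocation increment, so the real footprint can be higher.

```python
import json

# resources.json as quoted in this thread (a copy of resources-default.json).
RESOURCES_JSON = """
{
  "schema": "http://example.org/specification/v2.0.0",
  "metadata": {},
  "global": {
    "yarn.log.include.patterns": "",
    "yarn.log.exclude.patterns": ""
  },
  "components": {
    "slider-appmaster": {"yarn.memory": "512"},
    "NIMBUS": {
      "yarn.role.priority": "1",
      "yarn.component.instances": "1",
      "yarn.memory": "2048"
    },
    "STORM_UI_SERVER": {
      "yarn.role.priority": "2",
      "yarn.component.instances": "1",
      "yarn.memory": "1278"
    },
    "DRPC_SERVER": {
      "yarn.role.priority": "3",
      "yarn.component.instances": "1",
      "yarn.memory": "1278"
    },
    "SUPERVISOR": {
      "yarn.role.priority": "4",
      "yarn.component.instances": "1",
      "yarn.memory": "3072"
    }
  }
}
"""


def total_requested_memory_mb(resources: dict) -> int:
    """Sum yarn.memory (MB) times instance count over all components."""
    total = 0
    for props in resources["components"].values():
        # Assumption: a component with no yarn.component.instances entry
        # (here, slider-appmaster) counts as a single instance.
        instances = int(props.get("yarn.component.instances", "1"))
        total += instances * int(props["yarn.memory"])
    return total


if __name__ == "__main__":
    resources = json.loads(RESOURCES_JSON)
    print(total_requested_memory_mb(resources), "MB requested in total")
```

With the values in this thread that is 512 + 2048 + 1278 + 1278 + 3072 = 8188 MB across 5 containers, which matches the 5 running containers reported on the RM UI.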
