Hi Robert,

I figured out the problem just now. To avoid the error below, I had to set the 'hadoop.http.staticuser.user' property in core-site.xml (it defaults to dr.who). I can now get runtime data from the AppMaster using *curl* as well as in the GUI.
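As a minimal sketch of that setting: the thread does not state the value actually used, so the user name `prajakta` here is an assumption taken from the application owner shown in the proxy logs quoted below. The entry belongs inside the `<configuration>` element of core-site.xml.

```xml
<!-- core-site.xml: sketch only. The value "prajakta" is an assumption based on
     the application owner in the proxy logs; the default is dr.who. -->
<property>
  <name>hadoop.http.staticuser.user</name>
  <value>prajakta</value>
</property>
```

The daemons most likely need a restart before the new static user shows up in the web UIs and the proxy logs.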
I wonder if we have to set this property even when we are not specifying the yarn web-proxy address (when it runs as part of RM by default) as well. If yes, was it documented somewhere which I failed to see? :(

Anyways, thanks for your response so far.

Regards,
Prajakta

On Mon, Jul 9, 2012 at 3:29 PM, Prajakta Kalmegh <pkalm...@gmail.com> wrote:

> Hi Robert
>
> I started the proxyserver explicitly by specifying a value for the
> yarn.web-proxy.address in yarn-site.xml. The proxyserver did start and I
> tried getting the JSON response using the following command :
>
> curl --compressed -H "Accept: application/json" -X GET "http://localhost:8090/proxy/application_1341823967331_0001/ws/v1/mapreduce/jobs/job_1341823967331_0001"
>
> However, it refused connection and below is the excerpt from the
> Proxyserver logs:
> ---------
> 2012-07-09 14:26:40,402 INFO org.mortbay.log: Extract jar:file:/home/prajakta/Projects/IRL/hadoop-common/hadoop-dist/target/hadoop-3.0.0-SNAPSHOT/share/hadoop/mapreduce/hadoop-yarn-common-3.0.0-SNAPSHOT.jar!/webapps/proxy to /tmp/Jetty_localhost_8090_proxy____.ak3o30/webapp
> 2012-07-09 14:26:40,992 INFO org.mortbay.log: Started SelectChannelConnector@localhost:8090
> 2012-07-09 14:26:40,993 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.webproxy.WebAppProxy is started.
> 2012-07-09 14:26:40,993 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer is started.
> 2012-07-09 14:33:26,039 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://prajakta:44314/ws/v1/mapreduce/jobs/job_1341823967331_0001 which is the app master GUI of application_1341823967331_0001 owned by prajakta
> 2012-07-09 14:33:29,277 INFO org.apache.commons.httpclient.HttpMethodDirector: I/O exception (org.apache.commons.httpclient.NoHttpResponseException) caught when processing request: The server prajakta failed to respond
> 2012-07-09 14:33:29,277 INFO org.apache.commons.httpclient.HttpMethodDirector: Retrying request
> 2012-07-09 14:33:29,284 WARN org.mortbay.log: /proxy/application_1341823967331_0001/ws/v1/mapreduce/jobs/job_1341823967331_0001: java.net.SocketException: Connection reset
> 2012-07-09 14:37:33,834 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://prajakta:19888/jobhistory/job/job_1341823967331_0001/jobhistory/job/job_1341823967331_0001 which is the app master GUI of application_1341823967331_0001 owned by prajakta
> ---------------
>
> I am not sure why http request object is setting my remoteUser to dr.who.
> :(
>
> I gather from <https://issues.apache.org/jira/browse/MAPREDUCE-2858> that
> this warning is posted only in case where security is disabled. I assume
> that the proxy server is not disabled if security is disabled.
>
> Any idea what could be the reason for this I/O exception? Am I missing
> setting any property for proper access. Please let me know.
>
> Regards,
> Prajakta
>
> On Fri, Jul 6, 2012 at 10:59 PM, Prajakta Kalmegh <pkalm...@gmail.com> wrote:
>
>> I am using hadoop trunk (forked from github). It supports RESTful APIs as
>> I am able to retrieve JSON objects for RM (cluster/nodes info)+
>> Historyserver. The only issue is with AppMaster REST API.
>>
>> Regards,
>> Prajakta
>>
>> On Fri, Jul 6, 2012 at 10:55 PM, Robert Evans <ev...@yahoo-inc.com> wrote:
>>
>>> What version of hadoop are you using? It could be that the version you
>>> have does not have the RESTful APIs in it yet, and the proxy is working
>>> just fine.
>>>
>>> --Bobby Evans
>>>
>>> On 7/6/12 12:06 PM, "Prajakta Kalmegh" <pkalm...@gmail.com> wrote:
>>>
>>> >Robert, Thanks for the response. If I do not provide any explicit
>>> >configuration for the proxy server, do I still need to start it using the
>>> >'yarn start proxy server'? I am currently not doing it.
>>> >
>>> >Also, I am able to access the html page for proxy using the
>>> ><http://localhost:8088/proxy/{appid}/mapreduce/jobs> URL. (Note this URL
>>> >does not have the '/ws/v1/' part in it.) I get the html response when I
>>> >query for this URL in runtime.
>>> >
>>> >So I assume the proxy server must be starting fine since I am able to
>>> >access this URL. I will try logging more details tomorrow from my office
>>> >machine and will let you know the result.
>>> >
>>> >Regards,
>>> >Prajakta
>>> >
>>> >On Fri, Jul 6, 2012 at 10:22 PM, Robert Evans <ev...@yahoo-inc.com> wrote:
>>> >
>>> >> Sorry I did not respond sooner. The default behavior is to have the
>>> >> proxy server run as part of the RM. I am not really sure why it is not
>>> >> doing this in your case. If you set the config yourself to be a URI that is
>>> >> different from that of the RM then you need to launch a standalone proxy
>>> >> server. You can do this by running
>>> >>
>>> >> yarn start proxy server
>>> >>
>>> >> Without sitting down with you it is going to be somewhat difficult to
>>> >> debug why this is happening. However, in retrospect it would be nice to
>>> >> add in some extra logging to help indicate why the proxy server is not
>>> >> functioning as desired.
>>> >> If you could file a JIRA to add in the logging I
>>> >> would be happy to provide a patch to you and we can try and debug the
>>> >> issue further. Please file it under the MAPREDUCE JIRA project.
>>> >>
>>> >> --Bobby
>>> >>
>>> >> On 7/6/12 3:29 AM, "Prajakta Kalmegh" <pkalm...@gmail.com> wrote:
>>> >>
>>> >> >Re-posting as I haven't got a solution yet. Sorry for spamming. I won't be
>>> >> >able to proceed in my code until I get a JSON response using AppMaster
>>> >> >REST URL. :(
>>> >> >
>>> >> >Thanks,
>>> >> >Prajakta
>>> >> >
>>> >> >On Wed, Jul 4, 2012 at 5:55 PM, Prajakta Kalmegh <pkalm...@gmail.com> wrote:
>>> >> >
>>> >> >> Hi Robert/Harsh
>>> >> >>
>>> >> >> Thanks for your reply.
>>> >> >>
>>> >> >> My RM is starting just fine. The problem is with the use of
>>> >> >> http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce
>>> >> >> to get the JSON response.
>>> >> >>
>>> >> >> As I said before, I had not configured the yarn.web-proxy.address
>>> >> >> property in yarn-site.xml. I assumed it will use the RM's
>>> >> >> yarn.resourcemanager.webapp.address property value as default. However,
>>> >> >> it gives me a '404-Page not found error'. Today I tried specifying a
>>> >> >> value explicitly for the yarn.web-proxy.address property.
>>> >> >>
>>> >> >> On running the wordcount example, it even gives a url
>>> >> >> <http://localhost:8090/proxy/{appid}/> to track the App Master info.
>>> >> >> However, I am still not able to get a json response.
>>> >> >>
>>> >> >> Also, I tried to get the data from historyserver instead of runtime
>>> >> >> using the instructions given on page
>>> >> >> <http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html>
>>> >> >>
>>> >> >> HistoryServer REST response does not give me jobids corresponding to an
>>> >> >> application. It just lists all the jobs run until now.
>>> >> >> By the way, the documentation does say
>>> >> >>
>>> >> >> ----------
>>> >> >>
>>> >> >> "Both of the following URI's give you the history server information,
>>> >> >> from an application id identified by the appid value.
>>> >> >>   * http://<history server http address:port>/ws/v1/history
>>> >> >>   * http://<history server http address:port>/ws/v1/history/info"
>>> >> >> ---------
>>> >> >>
>>> >> >> But there is no provision to specify the application id with these REST
>>> >> >> URLs.
>>> >> >>
>>> >> >> Any idea how I can get the Application Master REST working and also
>>> >> >> linking jobids to application id using the HistoryServerREST API?
>>> >> >>
>>> >> >> Any help is appreciated. Thanks in advance.
>>> >> >> Regards,
>>> >> >> Prajakta
>>> >> >>
>>> >> >> On Fri, Jun 29, 2012 at 8:55 PM, Robert Evans <ev...@yahoo-inc.com> wrote:
>>> >> >>
>>> >> >>> Please don't file that JIRA. The proxy server is intended to front the
>>> >> >>> web server for all calls to the AM. This is so you only have to go to a
>>> >> >>> single location to get to any AM's web service. The proxy server is a
>>> >> >>> very simple proxy and just forwards the extra part of the path on to the
>>> >> >>> AM.
>>> >> >>>
>>> >> >>> If you are having issues with this please include the version you are
>>> >> >>> having problems with. Also please look at the logs for the RM on startup
>>> >> >>> to see if there is anything there indicating why it is not starting up.
>>> >> >>>
>>> >> >>> --Bobby Evans
>>> >> >>>
>>> >> >>> On 6/28/12 9:46 AM, "Harsh J" <ha...@cloudera.com> wrote:
>>> >> >>>
>>> >> >>> >As far as I can tell, the MR WebApp, as the name itself indicates on
>>> >> >>> >its doc page, starts only at the MR AM (which may be running at any
>>> >> >>> >NM), and it starts as an ephemeral port logged at in the AM logs
>>> >> >>> >usually as:
>>> >> >>> >
>>> >> >>> >INFO Web app /mapreduce started at [PORT]
>>> >> >>> >
>>> >> >>> >That it starts its own server with an ephemeral access point makes
>>> >> >>> >sense, since each job uses its own AM and having a common location may
>>> >> >>> >not work with the form of REST API documented at your link. Can you
>>> >> >>> >please file a JIRA to fix the doc and remove the proxy server refs,
>>> >> >>> >which are misleading?
>>> >> >>> >
>>> >> >>> >Do correct me if I'm wrong.
>>> >> >>> >
>>> >> >>> >On Thu, Jun 28, 2012 at 6:13 PM, Prajakta Kalmegh <pkalm...@gmail.com> wrote:
>>> >> >>> >> Hi
>>> >> >>> >>
>>> >> >>> >> I am trying to get the ApplicationMaster info using the
>>> >> >>> >> <http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/info>
>>> >> >>> >> link as described on the
>>> >> >>> >> <http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html>
>>> >> >>> >> page.
>>> >> >>> >>
>>> >> >>> >> I am able to access and retrieve JSON response for other modules
>>> >> >>> >> (ResourceManager, NodeManager and HistoryServer). However, I am getting
>>> >> >>> >> 'Page not found' when I try to use my ResourceManager Http address to
>>> >> >>> >> access the ApplicationMaster info.
>>> >> >>> >> I am using
>>> >> >>> >> <http://localhost:8088/proxy/{appid}/ws/v1/mapreduce/info> to retrieve
>>> >> >>> >> JSON response.
>>> >> >>> >>
>>> >> >>> >> The instructions say "The application master should be accessed via the
>>> >> >>> >> proxy. This proxy is configurable to run either on the resource manager or
>>> >> >>> >> on a separate host."
>>> >> >>> >>
>>> >> >>> >> My yarn-default.xml contains:
>>> >> >>> >> <property>
>>> >> >>> >>   <description>The address for the web proxy as HOST:PORT, if this is not
>>> >> >>> >>   given then the proxy will run as part of the RM</description>
>>> >> >>> >>   <name>yarn.web-proxy.address</name>
>>> >> >>> >>   <value/>
>>> >> >>> >> </property>
>>> >> >>> >>
>>> >> >>> >> and I did not set a value explicitly in yarn-site.xml. Any idea how I can
>>> >> >>> >> get this working? Thanks in advance.
>>> >> >>> >>
>>> >> >>> >> Regards,
>>> >> >>> >> Prajakta
>>> >> >>> >
>>> >> >>> >--
>>> >> >>> >Harsh J
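
The explicit proxy setup discussed in this thread can be sketched as a yarn-site.xml entry. The value localhost:8090 is the address that appears in the curl command and Jetty log quoted earlier in the thread; any other HOST:PORT would be an assumption about a particular cluster.

```xml
<!-- yarn-site.xml: sketch only. localhost:8090 is the proxy address used in
     this thread; substitute the real proxy host and port. -->
<property>
  <name>yarn.web-proxy.address</name>
  <value>localhost:8090</value>
</property>
```

With a value set, the proxy is expected to run as a standalone server rather than inside the RM, so AM REST calls would go to http://localhost:8090/proxy/{appid}/ws/v1/mapreduce/... instead of the RM web address.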