Hi Robert

I figured out the problem just now. To avoid the error below, I had to set
the 'hadoop.http.staticuser.user' property in core-site.xml (it defaults to
dr.who). I can now get runtime data from the AppMaster using *curl* as well
as in the GUI.
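
A minimal sketch of such a core-site.xml entry, assuming the static user
should be the OS user that owns the application (prajakta in the log excerpt
below). The value is illustrative and not necessarily the one used here.

```xml
<!-- core-site.xml: sketch only; the value is an assumed example user -->
<property>
  <name>hadoop.http.staticuser.user</name>
  <value>prajakta</value>
</property>
```

The daemons would presumably need a restart to pick up the change.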

I wonder if we have to set this property even when we do not specify
the yarn web-proxy address (i.e. when the proxy runs as part of the RM by
default). If yes, is it documented somewhere that I failed to see? :(

Anyways, thanks for your response so far.

Regards,
Prajakta



On Mon, Jul 9, 2012 at 3:29 PM, Prajakta Kalmegh <pkalm...@gmail.com> wrote:

> Hi Robert
>
> I started the proxyserver explicitly by specifying a value for the
> yarn.web-proxy.address in yarn-site.xml. The proxyserver did start and I
> tried getting the JSON response using the following command :
>
> curl --compressed -H "Accept: application/json" -X GET "http://localhost:8090/proxy/application_1341823967331_0001/ws/v1/mapreduce/jobs/job_1341823967331_0001"
>
> However, it refused connection and below is the excerpt from the
> Proxyserver logs:
> ---------
> 2012-07-09 14:26:40,402 INFO org.mortbay.log: Extract
> jar:file:/home/prajakta/Projects/IRL/hadoop-common/hadoop-dist/target/hadoop-3.0.0-SNAPSHOT/share/hadoop/mapreduce/hadoop-yarn-common-3.0.0-SNAPSHOT.jar!/webapps/proxy
> to /tmp/Jetty_localhost_8090_proxy____.ak3o30/webapp
> 2012-07-09 14:26:40,992 INFO org.mortbay.log: Started
> SelectChannelConnector@localhost:8090
> 2012-07-09 14:26:40,993 INFO
> org.apache.hadoop.yarn.service.AbstractService:
> Service:org.apache.hadoop.yarn.server.webproxy.WebAppProxy is started.
> 2012-07-09 14:26:40,993 INFO
> org.apache.hadoop.yarn.service.AbstractService:
> Service:org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer is started.
> 2012-07-09 14:33:26,039 INFO
> org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is
> accessing unchecked
> http://prajakta:44314/ws/v1/mapreduce/jobs/job_1341823967331_0001 which
> is the app master GUI of application_1341823967331_0001 owned by prajakta
> 2012-07-09 14:33:29,277 INFO
> org.apache.commons.httpclient.HttpMethodDirector: I/O exception
> (org.apache.commons.httpclient.NoHttpResponseException) caught when
> processing request: The server prajakta failed to respond
> 2012-07-09 14:33:29,277 INFO
> org.apache.commons.httpclient.HttpMethodDirector: Retrying request
> 2012-07-09 14:33:29,284 WARN org.mortbay.log:
> /proxy/application_1341823967331_0001/ws/v1/mapreduce/jobs/job_1341823967331_0001:
> java.net.SocketException: Connection reset
> 2012-07-09 14:37:33,834 INFO
> org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is
> accessing unchecked
> http://prajakta:19888/jobhistory/job/job_1341823967331_0001/jobhistory/job/job_1341823967331_0001 which
> is the app master GUI of application_1341823967331_0001 owned by
> prajakta
> ---------------
>
> I am not sure why the HTTP request object is setting my remoteUser to
> dr.who. :(
>
> I gather from <https://issues.apache.org/jira/browse/MAPREDUCE-2858> that
> this warning is posted only when security is disabled. I assume
> that the proxy server itself is not disabled when security is disabled.
>
> Any idea what could be the reason for this I/O exception? Am I missing
> a property that must be set for proper access? Please let me know.
>
> Regards,
> Prajakta
>
>
>
>
>
>
> On Fri, Jul 6, 2012 at 10:59 PM, Prajakta Kalmegh <pkalm...@gmail.com> wrote:
>
>> I am using hadoop trunk (forked from github). It supports RESTful APIs as
>> I am able to retrieve JSON objects for RM (cluster/nodes info)+
>> Historyserver. The only issue is with AppMaster REST API.
>>
>> Regards,
>> Prajakta
>>
>>
>>
>> On Fri, Jul 6, 2012 at 10:55 PM, Robert Evans <ev...@yahoo-inc.com> wrote:
>>
>>> What version of hadoop are you using?  It could be that the version you
>>> have does not have the RESTful APIs in it yet, and the proxy is working
>>> just fine.
>>>
>>> --Bobby Evans
>>>
>>> On 7/6/12 12:06 PM, "Prajakta Kalmegh" <pkalm...@gmail.com> wrote:
>>>
>>> >Robert , Thanks for the response. If I do not provide any explicit
>>> >configuration for the proxy server, do I still need to start it using
>>> the
>>> >'yarn start proxy server'? I am currently not doing it.
>>> >
>>> >Also, I am able to access the HTML page for the proxy using the
>>> ><http://localhost:8088/proxy/{appid}/mapreduce/jobs> URL. (Note this
>>> >URL does not have the '/ws/v1/' part in it.) I get the HTML response
>>> >when I query this URL at runtime.
>>> >
>>> >So I assume the proxy server must be starting fine since I am able to
>>> >access this URL. I will try logging more details tomorrow from my office
>>> >machine and will let you know the result.
>>> >
>>> >Regards,
>>> >Prajakta
>>> >
>>> >
>>> >
>>> >On Fri, Jul 6, 2012 at 10:22 PM, Robert Evans <ev...@yahoo-inc.com> wrote:
>>> >
>>> >> Sorry I did not respond sooner.  The default behavior is to have the
>>> >>proxy
>>> >> server run as part of the RM.  I am not really sure why it is not
>>> doing
>>> >> this in your case.  If you set the config yourself to be a URI that is
>>> >> different from that of the RM then you need to launch a standalone
>>> proxy
>>> >> server.  You can do this by running
>>> >>
>>> >> yarn start proxy server
>>> >>
>>> >> Without sitting down with you it is going to be somewhat difficult to
>>> >> debug why this is happening.  However, in retrospect it would be nice
>>> to
>>> >> add in some extra logging to help indicate why the proxy server is not
>>> >> functioning as desired.  If you could file a JIRA to add in the
>>> logging
>>> >>I
>>> >> would be happy to provide a patch to you and we can try and debug the
>>> >> issue further.  Please file it under the MAPREDUCE JIRA project.
>>> >>
>>> >> --Bobby
>>> >>
>>> >> On 7/6/12 3:29 AM, "Prajakta Kalmegh" <pkalm...@gmail.com> wrote:
>>> >>
>>> >> >Re-posting as I haven't got a solution yet. Sorry for spamming. I
>>> >>won't be
>>> >> >able to proceed in my code until I get a JSON response using
>>> AppMaster
>>> >> >REST
>>> >> >URL. :(
>>> >> >
>>> >> >Thanks,
>>> >> >Prajakta
>>> >> >
>>> >> >
>>> >> >On Wed, Jul 4, 2012 at 5:55 PM, Prajakta Kalmegh <pkalm...@gmail.com> wrote:
>>> >> >
>>> >> >> Hi Robert/Harsh
>>> >> >>
>>> >> >> Thanks for your reply.
>>> >> >>
>>> >> >> My RM is starting just fine. The problem is with the use of
>>> >> >>http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce
>>> >> >> to get the JSON response.
>>> >> >>
>>> >> >> As I said before, I had not configured the yarn.web-proxy.address
>>> >> >>property in yarn-site.xml. I assumed it will use the RM's
>>> >> >>yarn.resourcemanager.webapp.address property value as default.
>>> >>However,
>>> >> >>it gives me a '404-Page not found error'.  Today I tried specifying
>>> a
>>> >> >>value explicitly for the yarn.web-proxy.address property.
>>> >> >>
>>> >> >> On running the wordcount example, it even gives a URL
>>> >> >><http://localhost:8090/proxy/{appid}/> to track the App Master info.
>>> >> >>However, I am still not able to get a JSON response.
>>> >> >>
>>> >> >> Also, I tried to get the data from historyserver instead of runtime
>>> >> >>using the instructions given on page
>>> >> >><http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html>
>>> >> >>
>>> >> >> HistoryServer REST response does not give me jobids corresponding
>>> to
>>> >>an
>>> >> >>application. It just lists all the jobs run until now. By the way,
>>> the
>>> >> >>documentation does say
>>> >> >>
>>> >> >> ----------
>>> >> >>
>>> >> >> "Both of the following URI's give you the history server
>>> information,
>>> >> >>from an application id identified by the appid value.
>>> >> >>   * http://<history server http address:port>/ws/v1/history
>>> >> >>   * http://<history server http address:port>/ws/v1/history/info"
>>> >> >> ---------
>>> >> >>
>>> >> >> But there is no provision to specify the application id with these
>>> >>REST
>>> >> >>URLs.
>>> >> >>
>>> >> >> Any idea how I can get the Application Master REST working and also
>>> >> >>linking jobids to application id using the HistoryServerREST API?
>>> >> >>
>>> >> >> Any help is appreciated. Thanks in advance.
>>> >> >> Regards,
>>> >> >> Prajakta
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> On Fri, Jun 29, 2012 at 8:55 PM, Robert Evans <ev...@yahoo-inc.com> wrote:
>>> >> >>
>>> >> >>> Please don't file that JIRA.  The proxy server is intended to
>>> front
>>> >>the
>>> >> >>> web server for all calls to the AM.  This is so you only have to
>>> go
>>> >>to
>>> >> >>>a
>>> >> >>> single location to get to any AM's web service.  The proxy server
>>> >>is a
>>> >> >>> very simple proxy and just forwards the extra part of the path on
>>> to
>>> >> >>>the
>>> >> >>> AM.
>>> >> >>>
>>> >> >>> If you are having issues with this please include the version you
>>> >>are
>>> >> >>> having problems with.  Also please look at the logs for the RM on
>>> >> >>>startup
>>> >> >>> to see if there is anything there indicating why it is not
>>> starting
>>> >>up.
>>> >> >>>
>>> >> >>> --Bobby Evans
>>> >> >>>
>>> >> >>> On 6/28/12 9:46 AM, "Harsh J" <ha...@cloudera.com> wrote:
>>> >> >>>
>>> >> >>> >As far as I can tell, the MR WebApp, as the name itself indicates on
>>> >> >>> >its doc page, starts only at the MR AM (which may be running at any
>>> >> >>> >NM), and it starts on an ephemeral port that is logged in the AM logs,
>>> >> >>> >usually as:
>>> >> >>> >
>>> >> >>> >INFO Web app /mapreduce started at [PORT]
>>> >> >>> >
>>> >> >>> >That it starts its own server with an ephemeral access point
>>> makes
>>> >> >>> >sense, since each job uses its own AM and having a common
>>> location
>>> >>may
>>> >> >>> >not work with the form of REST API documented at your link. Can
>>> you
>>> >> >>> >please file a JIRA to fix the doc and remove the proxy server
>>> refs,
>>> >> >>> >which are misleading?
>>> >> >>> >
>>> >> >>> >Do correct me if I'm wrong.
>>> >> >>> >
>>> >> >>> >On Thu, Jun 28, 2012 at 6:13 PM, Prajakta Kalmegh <pkalm...@gmail.com> wrote:
>>> >> >>> >> Hi
>>> >> >>> >>
>>> >> >>> >> I am trying to get the ApplicationMaster info using the
>>> >> >>> >> <http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/info>
>>> >> >>> >> link as described on the
>>> >> >>> >> <http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html>
>>> >> >>> >> page.
>>> >> >>> >>
>>> >> >>> >> I am able to access and retrieve JSON response for other modules
>>> >> >>> >> (ResourceManager, NodeManager and HistoryServer). However, I am getting
>>> >> >>> >> 'Page not found' when I try to use my ResourceManager HTTP address to
>>> >> >>> >> access the ApplicationMaster info. I am using
>>> >> >>> >> <http://localhost:8088/proxy/{appid}/ws/v1/mapreduce/info> to retrieve
>>> >> >>> >> the JSON response.
>>> >> >>> >>
>>> >> >>> >> The instructions say "The application master should be accessed
>>> >>via
>>> >> >>>the
>>> >> >>> >> proxy. This proxy is configurable to run either on the resource
>>> >> >>>manager
>>> >> >>> >>or
>>> >> >>> >> on a separate host."
>>> >> >>> >>
>>> >> >>> >> My yarn-default.xml contains:
>>> >> >>> >>  <property>
>>> >> >>> >>    <description>The address for the web proxy as HOST:PORT, if this is not
>>> >> >>> >>     given then the proxy will run as part of the RM</description>
>>> >> >>> >>     <name>yarn.web-proxy.address</name>
>>> >> >>> >>     <value/>
>>> >> >>> >>  </property>
>>> >> >>> >>
>>> >> >>> >> and I did not set a value explicitly in yarn-site.xml.  Any
>>> idea
>>> >> >>>how I
>>> >> >>> >>can
>>> >> >>> >> get this working? Thanks in advance.
>>> >> >>> >>
>>> >> >>> >> Regards,
>>> >> >>> >> Prajakta
>>> >> >>> >
>>> >> >>> >
>>> >> >>> >
>>> >> >>> >--
>>> >> >>> >Harsh J
>>> >> >>>
>>> >> >>>
>>> >> >>
>>> >>
>>> >>
>>>
>>>
>>
>
