Thanks Ken. I just tried that and still get the same result: the MTV URL is not reported as the final URL. Any other ideas on what could be wrong?
thanks

On Wed, Sep 8, 2010 at 12:00 PM, Ken Krugler <[email protected]> wrote:

>
> On Sep 8, 2010, at 11:42am, Jim wrote:
>
>> I found a URL that httpclient doesn't seem to be handling redirects on:
>>
>> http://news.google.com/news/url?sa=t&fd=R&usg=AFQjCNGrJk-F7Dmshmtze2yhifxRsv8sRg&url=http://www.mtv.com/news/articles/1647243/20100907/story.jhtml
>>
>> should 302 to:
>> http://www.mtv.com/news/articles/1647243/20100907/story.jhtml
>>
>> when I look at the headers in the browser everything looks good:
>>
>> HTTP/1.1 302 Moved Temporarily
>> Content-Type: text/html; charset=UTF-8
>> Location: http://www.mtv.com/news/articles/1647243/20100907/story.jhtml
>> Content-Length: 258
>> Date: Wed, 08 Sep 2010 18:40:21 GMT
>> Expires: Wed, 08 Sep 2010 18:40:21 GMT
>> Cache-Control: private, max-age=0
>> X-Content-Type-Options: nosniff
>> X-Frame-Options: SAMEORIGIN
>> X-Xss-Protection: 1; mode=block
>> Server: GSE
>> Set-Cookie: PREF=ID=024209255b405b06:TM=1283971221:LM=1283971221:S=AG-13_7Cjg_EqlRY; expires=Fri, 07-Sep-2012 18:40:21 GMT; path=/; domain=.google.com
>> Connection: close
>>
>> However httpclient doesn't seem to give me the final URL. Here is the code I
>> was using
>>
>> HttpHead httpget = null;
>> HttpHost target = null;
>> HttpUriRequest req = null;
>>
>> String startURL = "http://news.google.com/news/url?sa=t&fd=R&usg=AFQjCNGrJk-F7Dmshmtze2yhifxRsv8sRg&url=http://www.mtv.com/news/articles/1647243/20100907/story.jhtml";
>> HttpContext localContext = new BasicHttpContext();
>>
>> localContext.setAttribute(ClientContext.COOKIE_STORE, HttpClientFetcher.emptyCookieStore);
>> httpget = new HttpHead(startURL);
>>
>> HttpResponse response = httpClient.execute(httpget, localContext);
>>
>> Header[] test = response.getAllHeaders();
>> for(Header h: test) {
>>     logger.info(h.getName()+ ": "+h.getValue());
>> }
>>
>> target = (HttpHost) localContext.getAttribute(ExecutionContext.HTTP_TARGET_HOST);
>>
>> req = (HttpUriRequest) localContext.getAttribute(ExecutionContext.HTTP_REQUEST);
>>
>> // STILL PRINTS OUT THE GOOGLE NEWS LINK
>> finalURL = target+""+req.getURI();
>>
>> Am I doing something wrong? thanks
>>
>
> I think you need to explicitly get the URI from the host, and then combine
> with the final request - or at least this code below is how I'm doing it
> (and it seems to work), Oleg can probably improve on it...
>
> HttpHost host = (HttpHost)localContext.getAttribute(ExecutionContext.HTTP_TARGET_HOST);
> HttpUriRequest finalRequest = (HttpUriRequest)localContext.getAttribute(ExecutionContext.HTTP_REQUEST);
>
> try {
>     URL hostUrl = new URI(host.toURI()).toURL();
>     return new URL(hostUrl, finalRequest.getURI().toString()).toExternalForm();
>
> -- Ken
>
> --------------------------------------------
> <http://ken-blog.krugler.org>
> +1 530-265-2225
>
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://bixolabs.com
> e l a s t i c   w e b   m i n i n g
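
For reference, the combining step Ken describes in the quoted message (resolve the final request URI against the target host) can be tried on its own with nothing but java.net. This is only a rough sketch, not HttpClient code: the class name FinalUrlSketch and the method resolveFinalUrl are made up for illustration, and the two string arguments stand in for what host.toURI() and finalRequest.getURI().toString() would return.

```java
import java.net.URI;

public class FinalUrlSketch {

    // Hypothetical helper: resolves the request URI against the target host URI.
    // A relative request URI (path only) picks up the host's scheme and authority;
    // an absolute request URI is returned unchanged.
    public static String resolveFinalUrl(String targetHostUri, String requestUri) {
        URI base = URI.create(targetHostUri);
        return base.resolve(requestUri).toString();
    }

    public static void main(String[] args) {
        // Assumed inputs: the post-redirect host and a path-only request URI.
        System.out.println(resolveFinalUrl(
                "http://www.mtv.com",
                "/news/articles/1647243/20100907/story.jhtml"));

        // An absolute request URI is not rewritten by the host.
        System.out.println(resolveFinalUrl(
                "http://www.mtv.com",
                "http://news.google.com/news/url?sa=t&fd=R"));
    }
}
```

If the context attributes still describe the original request, this step will of course still produce the Google News link, so it only helps once the redirect has actually been followed.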
