Thank you, that helped a lot.
__________________________________________
~Sward

On Thu, Jun 4, 2009 at 11:14 AM, Charles François Rey wrote:

> Here's an example:
> ---
> // bare minimum, lots of ways to improve how things are handled
> DefaultHttpClient httpclient = new DefaultHttpClient();
> String url =
>     "http://www.w3schools.com/HTML/tryit.asp?filename=tryhtml_frame_cols";
> HttpUriRequest request = new HttpGet(url);
> HttpResponse response = httpclient.execute(request);
> HtmlCleaner cleaner = new HtmlCleaner();
> // note that HtmlCleaner can download a URL itself, but let's assume
> // that you need httpclient to do it (e.g. POST request,
> // special settings, ...)
> TagNode rootNode = cleaner.clean(response.getEntity().getContent());
> Document doc = (new DomSerializer(cleaner.getProperties(),
>     true)).createDOM(rootNode);
> // we're just going to display the target urls of the frames
> XPath xpath = XPathFactory.newInstance().newXPath();
> // XPath is very useful when dealing with HTML/XML
> NodeList nodes = (NodeList) xpath.evaluate("//frame/@src", doc,
>     XPathConstants.NODESET);
> for (int i = 0; i < nodes.getLength(); i++) {
>     System.out.println(nodes.item(i).getNodeValue());
> }
> ---
>
> This example should display the "src" of the 3 frames in the page at the
> given URL, i.e.:
> frame_a.htm
> frame_b.htm
> frame_c.htm
>
> Those are relative paths, so you would have to prefix them with the correct
> base path to fetch them.
>
> Note on the packages used: XPath comes from the standard package
> javax.xml.xpath, the HtmlCleaner library comes from
> http://htmlcleaner.sourceforge.net/, and Document and NodeList come from
> the standard package org.w3c.dom.
>
> On 4 June 09, at 17:12, Charles François Rey wrote:
>
>> Frameset is an HTML concept. HttpClient takes care of HTTP, not HTML.
>>
>> That being said, it is possible to follow framesets: just download the
>> HTML file, parse it and follow the frame definitions.
>>
>> If I had to do it, I'd use HttpClient to retrieve the HTML,
>> HtmlCleaner to clean the HTML, and XPath to filter the frame "src"
>> attributes.
>>
>> On 4 June 09, at 16:44, Scott Ward wrote:
>>
>>> Is it possible to follow framesets using HttpClient? I have searched
>>> all over and haven't found anything so I thought that I would try this.
>>>
>>> If it is, can you direct me to it in the API or show an example?
>>>
>>> Any help is greatly appreciated.
>>> __________________________________________
>>>
>>> ~Sward
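
On the point in the thread that the frame "src" values are relative paths: one way to turn them into fetchable URLs is java.net.URI from the standard library. This is only a minimal sketch. It assumes the paths are relative to the URL of the page that was requested (a <base href> in the page would change that), and the class and method names (FrameUrlResolver, resolveAll) are made up for illustration and are not part of HttpClient or HtmlCleaner.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of HttpClient or HtmlCleaner.
class FrameUrlResolver {

    // Resolves each frame "src" value against the URL of the page containing the frameset.
    // Assumption: the page declares no <base href>, so the page URL is the base.
    static List<String> resolveAll(String pageUrl, List<String> frameSrcs) {
        URI base = URI.create(pageUrl);
        List<String> absolute = new ArrayList<String>();
        for (String src : frameSrcs) {
            // URI.resolve applies the standard relative-reference rules;
            // a src that is already absolute comes back unchanged.
            // Note: a src with illegal characters (e.g. spaces) would throw
            // IllegalArgumentException and would need escaping first.
            absolute.add(base.resolve(src.trim()).toString());
        }
        return absolute;
    }

    public static void main(String[] args) {
        String pageUrl =
            "http://www.w3schools.com/HTML/tryit.asp?filename=tryhtml_frame_cols";
        List<String> srcs = Arrays.asList("frame_a.htm", "frame_b.htm", "frame_c.htm");
        for (String url : resolveAll(pageUrl, srcs)) {
            System.out.println(url);
        }
    }
}
```

Each resolved string can then be passed to new HttpGet(...) as in the example quoted above. With the page URL from the thread, "frame_a.htm" should resolve to http://www.w3schools.com/HTML/frame_a.htm, since the last path segment and the query of the base URL are dropped during resolution.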
