Thank you, that helped a lot.
__________________________________________

~Sward


On Thu, Jun 4, 2009 at 11:14 AM, Charles François Rey <[email protected]> wrote:

> Here's an example:
> ---
> // bare minimum, lots of ways to improve how things are handled
> DefaultHttpClient httpclient = new DefaultHttpClient();
> String url = "http://www.w3schools.com/HTML/tryit.asp?filename=tryhtml_frame_cols";
> HttpUriRequest request = new HttpGet(url);
> HttpResponse response = httpclient.execute(request);
> HtmlCleaner cleaner = new HtmlCleaner();
> // note that HtmlCleaner can download a URL itself, but let's assume
> // that you need httpclient to do it (e.g. POST request,
> // special settings, ...)
> TagNode rootNode = cleaner.clean(response.getEntity().getContent());
> Document doc = (new DomSerializer(cleaner.getProperties(),
> true)).createDOM(rootNode);
> // we're just going to display the target urls of the frames
> XPath xpath = XPathFactory.newInstance().newXPath();
> // XPath is very useful when dealing with HTML/XML ..
> NodeList nodes = (NodeList) xpath.evaluate("//frame/@src", doc,
>         XPathConstants.NODESET);
> for (int i = 0; i < nodes.getLength(); i++) {
>     System.out.println(nodes.item(i).getNodeValue());
> }
> ---
>
> This example should print the src values of the 3 frames defined at the
> given URL, i.e.:
>        frame_a.htm
>        frame_b.htm
>        frame_c.htm
>
> Those are relative paths, so you would have to resolve them against the
> correct base path to fetch them.
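
A minimal sketch of that resolution step, assuming the frame paths are relative to the URL of the page itself (i.e. the HTML has no <base> tag). The names FrameUrlResolver and resolveAll are made up for illustration; only java.net.URI from the JDK is used:

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FrameUrlResolver {

    // Resolve each relative frame path against the URL of the page that declared it.
    static List<String> resolveAll(String pageUrl, List<String> relativeSrcs) {
        URI base = URI.create(pageUrl);
        List<String> resolved = new ArrayList<>();
        for (String src : relativeSrcs) {
            resolved.add(base.resolve(src).toString());
        }
        return resolved;
    }

    public static void main(String[] args) {
        String pageUrl =
            "http://www.w3schools.com/HTML/tryit.asp?filename=tryhtml_frame_cols";
        List<String> srcs = Arrays.asList("frame_a.htm", "frame_b.htm", "frame_c.htm");
        for (String url : resolveAll(pageUrl, srcs)) {
            // prints http://www.w3schools.com/HTML/frame_a.htm, then frame_b, frame_c
            System.out.println(url);
        }
    }
}
```

URI.resolve drops the last path segment and the query string of the base before appending the relative path, so no manual string concatenation is needed. Each resolved URL could then be fetched with another HttpGet.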
>
> Note on the packages used: XPath comes from the standard package
> javax.xml.xpath, and the HtmlCleaner library comes from
> http://htmlcleaner.sourceforge.net/, Document and NodeList come from the
> standard package org.w3c.dom.
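
Since the XPath and DOM classes are all part of the JDK, the XPath step can be tried on its own without any third-party library. The sketch below assumes the input is already well-formed markup (a stand-in for what HtmlCleaner would produce from real-world HTML); the class name FrameSrcExtractor and the method frameSources are hypothetical:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FrameSrcExtractor {

    // Returns the src attribute of every <frame> element, in document order.
    // Assumption: the markup is well-formed; messy HTML must be cleaned first.
    static List<String> frameSources(String wellFormedHtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wellFormedHtml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//frame/@src", doc, XPathConstants.NODESET);
            List<String> sources = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                sources.add(nodes.item(i).getNodeValue());
            }
            return sources;
        } catch (Exception e) {
            // Parsing or XPath failure: surface it as an unchecked exception.
            throw new IllegalStateException("Could not extract frame sources", e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><frameset cols=\"25%,50%,25%\">"
                + "<frame src=\"frame_a.htm\"/>"
                + "<frame src=\"frame_b.htm\"/>"
                + "<frame src=\"frame_c.htm\"/>"
                + "</frameset></html>";
        System.out.println(frameSources(html));
    }
}
```

With real pages the DocumentBuilder would usually reject the input as malformed, which is why the example above goes through HtmlCleaner and DomSerializer to build the Document instead.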
>
>
> On 4 June 09, at 17:12, Charles François Rey wrote:
>
>  Frameset is an HTML concept. HttpClient takes care of HTTP, not HTML.
>>
>> That being said, it is possible to follow framesets: just download the
>> HTML file, parse it, and follow the frame definitions.
>>
>> If I had to do it, I'd use HttpClient to retrieve the HTML,
>> HTMLCleaner to clean the HTML, and XPath to filter the Frame "src"
>> attributes.
>>
>> On 4 June 09, at 16:44, Scott Ward wrote:
>>
>>  Is it possible to follow framesets using HttpClient?  I have searched
>>> all over and haven't found anything, so I thought that I would try this.
>>>
>>> If it is, can you direct me to it in the API or show an example?
>>>
>>> Any help is greatly appreciated.
>>> __________________________________________
>>>
>>> ~Sward
>>>
>>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>
