It is possible, but you will be blocked or get a security warning if
you access a URL outside your own site. You might wonder 'why would I
scrape my own site?' Well, ask my boss, who wants to index all the
pages on our intranet.

You can use JSNI like this to get the content:

public native String getIFRAMEBodyContent(String iframeId) /*-{
    // In JSNI, $doc refers to the host page's document.
    return $doc.getElementById(iframeId).contentWindow.document.body.innerHTML;
}-*/;

Once you have the HTML, you can wrap it in an HTML widget and work
with its root element:

HTML html = new HTML(getIFRAMEBodyContent("myIframe"));
Element rootElement = html.getElement();
// be happy

If you must scrape pages outside your domain and it has to be done
in the browser, you can use a signed Java applet (it would be a great
exercise of your Java knowledge).

Anyway, the easiest way would be with server side code as *lineman78*
and *cokol* said.
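
For the server-side route, here is a minimal sketch of fetching a page
as a string with URL.openConnection, as lineman78 suggested. The class
name PageFetcher, the method name fetch, the 10 second timeouts and the
UTF-8 decoding are my own assumptions for illustration; a real page may
declare a different charset.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of GWT: runs on the server side only.
class PageFetcher {

    // Downloads the resource at the given address and returns it as text.
    // Assumes the content is UTF-8 encoded.
    static String fetch(String address) throws IOException {
        URLConnection conn = URI.create(address).toURL().openConnection();
        conn.setConnectTimeout(10_000); // milliseconds, arbitrary choice
        conn.setReadTimeout(10_000);

        StringBuilder html = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }
}
```

You would call PageFetcher.fetch(...) from your servlet or GWT RPC
service implementation, parse the returned string there, and send
only the results back to the client.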

Cheers,
Henrique Viecili
--
Think outside the box, limitations are self imposed!

On Aug 12, 3:12 pm, cokol <[email protected]> wrote:
> nope, thats not possible - u cannot access JS namespace of an iframe,
> so serverside is the only way but you can bring up results into the
> client though
>
> On 12 Aug., 14:35, Henrique Viecili <[email protected]> wrote:
>
> > hmmm... you could use IFRAME to load the page, some JSNI to get the
> > HTML from the IFRAME (you might get a security warning or even be
> > blocked), after you have the HTML you just use DOM support on GWT to
> > do the thing.
>
> > but should be much easier if you use any server side language to do
> > that for you
>
> > On Aug 10, 6:09 pm, lineman78 <[email protected]> wrote:
>
> > > First of all GWT is executed client side and therefore XSRF security
> > > should prevent you from scraping another site directly.  However, you
> > > can do scraping quite easily with server-side java.  PHP is also a
> > > server executed language, so anything you would usually do in php, you
> > > will do it via server side java with GWT.  There are a few different
> > > ways you can scrape a page in java.
>
> > > 1) External Libraries (JScrape, XQuery)
> > > 2) Parse the HTML as XML (DOM or SAX)
> > > 3) Regex
>
> > > These all require you to get the HTML page as a string which is rather
> > > easy (see URL.openConnection)
>
> > > On Aug 10, 6:48 am, Fermin <[email protected]> wrote:
>
> > > > Hi,
>
> > > > I don't found any reference to do scraping with GWT, is posible ? Like
> > > > CURL in php ?
>
> > > > Thx 4 all
