On Saturday, May 4, 2013 11:46:13 AM UTC-5, Phil wrote:
> Is there any way to have UrlFetch grab the HTML that loads via JavaScript?

No, there isn't. That would require URLFetch to execute the JavaScript, and that is not the URLFetch service's job: it only downloads the response for the URL you give it. You would have to read the JavaScript yourself, find the URL it loads, and then call URLFetch on that URL.
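To make the "find the URL the JavaScript is loading" step concrete, here is a rough Python sketch. It is an illustration under assumptions, not a robust solution: the helper names (`find_script_urls`, `fetch_script_urls`) are made up for this example, the regular expressions only catch quoted URLs written literally inside inline <script> blocks, and URLs that a script builds up at runtime will be missed. The `fetch` argument is any callable you supply; on App Engine it could wrap `urlfetch.fetch`.

```python
import re
from urllib.parse import urljoin

# Quoted strings in script text that look like URLs or absolute paths.
_URL_IN_JS = re.compile(r"""["']((?:https?:)?//[^"'\s]+|/[^"'\s/][^"'\s]*)["']""")
# Bodies of inline <script> elements.
_SCRIPT_BODY = re.compile(r"<script\b[^>]*>(.*?)</script>", re.IGNORECASE | re.DOTALL)


def find_script_urls(html, base_url):
    """Return absolute URLs mentioned in inline scripts, in order, without duplicates."""
    found = []
    for body in _SCRIPT_BODY.findall(html):
        for match in _URL_IN_JS.finditer(body):
            absolute = urljoin(base_url, match.group(1))
            if absolute not in found:
                found.append(absolute)
    return found


def fetch_script_urls(html, base_url, fetch):
    """Fetch each candidate URL with the caller-supplied `fetch` callable.

    Hypothetical App Engine usage (not imported here):
        fetch=lambda url: urlfetch.fetch(url).content
    """
    return {url: fetch(url) for url in find_script_urls(html, base_url)}
```

In practice you would look through the candidates by hand first (or watch the browser's network panel) and fetch only the one URL that actually returns the content you are after.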
> Specifically, I'm trying to grab this page:
> http://www.groupon.com/browse/san-francisco?category=restaurants-and-bars
>
> Any idea how to get the HTML that loads via JS in the middle of the page?

If I were doing this, I'd use an HTML parser such as Python's Beautiful Soup ( http://www.crummy.com/software/BeautifulSoup/ ) or Java's HtmlUnit ( http://htmlunit.sourceforge.net/ ) to work with the downloaded page. Beautiful Soup only parses the HTML it is given; HtmlUnit can also execute JavaScript, so you can use it to load content that is pulled in via JS. Then just walk the DOM and find what you need.

-----------------
-Vinny P
Technology & Media Advisor
Chicago, IL

My Go side project: http://invalidmail.com/
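
P.S. A small sketch of the "go through the DOM and find what you need" step, in Python. To keep it dependency-free it uses the standard library's `html.parser` as a stand-in for Beautiful Soup, so it is not the Beautiful Soup API itself; the class and function names (`LinkCollector`, `collect_links`) are invented for this example, and collecting links is just one assumed notion of "what you need". It works only on HTML you already have, so JS-loaded content must be fetched separately first.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def collect_links(html):
    """Parse `html` and return the list of (href, text) pairs found."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

With Beautiful Soup installed, the rough equivalent would be a loop over `soup.find_all("a")`, which is shorter and more forgiving of broken markup.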
