On Saturday, May 4, 2013 11:46:13 AM UTC-5, Phil wrote:

>  Is there anyway to have UrlFetch grab the html that loads via javascript?
>
>
No, there isn't. That would require URLFetch to execute the JavaScript, 
and that's not a job for the URLFetch service. You would have to read the 
page's JavaScript yourself, find the URL it loads the content from, then 
call URLFetch to fetch that URL.
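
A rough sketch of that approach in Python, using only the standard library. 
The regular expression and the names ScriptCollector, find_script_urls and 
fetch are illustrative assumptions, not part of any App Engine API, and 
urllib stands in for URLFetch here. A page that assembles its URLs at runtime 
will defeat a simple pattern like this, so treat it as a starting point.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ScriptCollector(HTMLParser):
    """Collects the text of inline <script> elements."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.scripts[-1] += data


# Quoted strings that look like absolute URLs or root-relative paths.
# This pattern is a guess at what the page's JS contains, not a general rule.
URL_IN_JS = re.compile(r"""["']((?:https?://|/)[^"'\s]+)["']""")


def find_script_urls(html, base_url):
    """Return candidate URLs found in inline scripts, resolved against base_url."""
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()
    urls = []
    for script in collector.scripts:
        for match in URL_IN_JS.findall(script):
            urls.append(urljoin(base_url, match))
    return urls


def fetch(url):
    """Plain HTTP GET; on App Engine this is the step URLFetch would perform."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")
```

You would call fetch() on the original page, pass the result to 
find_script_urls(), pick out the URL that serves the content you want, and 
call fetch() again on that URL.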


On Saturday, May 4, 2013 11:46:13 AM UTC-5, Phil wrote:
>
> Specifically, I'm trying to grab this page: 
> http://www.groupon.com/browse/san-francisco?category=restaurants-and-bars
>
> Any idea how to get the html that loads via js in the middle of the page?
>
>
If I were doing this, I'd use an HTML parser like Python's Beautiful Soup ( 
http://www.crummy.com/software/BeautifulSoup/ ) or Java's HtmlUnit ( 
http://htmlunit.sourceforge.net/ ) to process the downloaded page. 
Beautiful Soup only parses markup, but HtmlUnit can also execute JavaScript, 
so you can use it to load content that is pulled in via JS. Then just walk 
the DOM and find what you need. 
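
To give an idea of what going through the parsed page looks like, here is a 
minimal Python sketch. It uses the standard library's html.parser as a 
stand-in for Beautiful Soup, which would make the same job shorter. The 
names LinkCollector and extract_links are made up for illustration, and 
collecting links is only an assumed example of "what you need". It works on 
whatever HTML you already have in hand and does not run any JavaScript.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Records (href, text) for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while inside a link we are tracking
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) tuples in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```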


-----------------
-Vinny P
Technology & Media Advisor
Chicago, IL

My Go side project: http://invalidmail.com/

